J . Phys. Chem. 1994,98, 5265-5271
5265
Identifiability and Distinguishability of General Reaction Systems Sandor Vajda' Department of Biomedical Engineering, Boston University, 44 Cummington Street, Boston, Massachusetts 02215
Herschel Rabitz Department of Chemistry, Princeton University, Princeton, New Jersey 08544 Received: September 24, 1993; In Final Form: February 28, 1994'
Identifiability and distinguishability are essential uniqueness features of reaction schemes. Identifiability deals with the problem of determining whether an experiment is able to supply the desired information on the parameters of a model, whereas distinguishability examines the uniqueness of a kinetic model itself within a given class of competing models. Deterministic analysis, assuming error-free and continuous knowledge of measurable quantities, is a useful first step toward establishing uniqueness properties, because deterministic identifiability (distinguishability) is a necessary condition for their existence in any realistic experiment. Deterministic uniqueness conditions have been described in the literature only for first-order reaction systems. This paper extends the method to isothermal systems that include reactions of arbitrary order with mass-action-type rate equations. In contrast to the first-order case with relatively simple examples of unidentifiable and/or undistinguishable systems, all higher order schemes we have studied turned out to be both identifiable and distinguishable from other schemes in a deterministic sense when consideration is restricted to reasonable kinetic experiments. In principle, an identifiable and distinguishable model can be uniquely and completely reconstructed from observable quantities such as the concentrations of stable species, provided that these functions are known continuously and are error-free. In many cases, however, the presence of measurement noise and limitations on the sampling rate make unique reconstruction not possible. Nevertheless, since unidentifiability and indistinguishability of higher order kinetic schemes are of numerical rather than of deterministic origin, the difficulties can be controlled to a certain degree by improving the design of experiments and by raising measurement accuracy.
I. Introduction The ultimate goal of research in chemical kinetics is to deduce the reaction mechanism, that is, to identify a set of reactions and a corresponding set of rate constants that are compatible with the entire body of theoretical and experimental information available on the particular process. It is clear that neither the reactions nor the rate parameters of such a mechanism are necessarily unique.l In this paper we discuss the problem of identifiability, which deals with the uniqueness of the parameters in a given model, and the problem of distinguishability, which questions the uniqueness of a model structure within a class of competing models. In both problems, nonuniqueness can have two different sources. In the first case, there exist two or more parameter values at which a model generates identical profiles for the observable quantities, or the calculated responses are identical for two or more kinetic models. Since nonuniqueness is present even without any measurement noise, the corresponding phenomena are called deterministic unidentifiability and deterministic indistinguishability, respectively.* In the second case, two substantially different parameter values yield computed curves for the observables that are not identical but too close to be discriminated, and the model is said to be is practically unidentifiable. In a practical or nondeterministic framework a mechanism is never unique, since it is always possible to postulate additional steps that involve unstable intermediates which occur rapidly enough so that their inclusion does not significantly alter the computed magnitudes of the observable species.3 In our previous paper,2 relatively simple algebraic conditions were presented for deterministic identifiability and distinguishability of first-order or pseudo-first-order reaction systems. We have shown that rather simple reaction schemes can be either
* Abstract published in Advonce ACS Abstructs, April 1, 1994.
unidentifiable or not uniquely identifiable, unless the concentrations are measured for a complete set of independent species. For example, the rate constants kl and k2 of the first-order consecutive reactions A-B,
B-C
(1)
cannot be uniquely determined from absorbance data of the form
0 = eJA]
+ tg[B] + cc[C]
if the intermediate species B cannot be isolated, and hence the molar extinction coefficient eg is an unknown parameter which is estimated simultaneously with kl and k2. The slightly more complex first-order reaction scheme A e B , B-C
(3)
is unidentifiable under thesameconditions; i.e., there are infintely many parameter vectors k ( k l ,k-1, k2, eg) that generate the same values for the observable 0 defined by eq 2. In this paper we extend the analysis of identifiability and distinguishability to isothermal systems that include higher order elementary reactions with mass-action-type rate equations. The extension became possible due to new results in mathematical system theory, in particular the local state isomorphism the0rem.u While this theorem enables us to formulate identifiability and distinguishability conditions for any nonlinear systems if the right hand sides of the differential equations are sufficiently smooth, the actual tests require nontrivial calculations in the general case.G8 The analysis is much simpler for dynamical systems with polynomial nonlinearities, and here we restrict consideration to isothermal processes with mass-action-type kinetics. Since very simple first-order reaction systems frequently exhibit nonuniqueness (unidentifiability and/or indistinguishability),2
0022-3654/94/2098-5265$~4.50/0 0 1994 American Chemical Society
5266 The Journal of Physical Chemistry, Vol. 98, No. 20, 1994
Vajda and Rabitz
one might expect that introducing higher order reactions would identifiable at k if there exists a neighborhood-W of k in Q such that k is distinguishable from any other k E W. Local lead to further "exotic" phenomena. However, the presence of identifiability is a weaker property th_an unique_ identifiability bimolecular elementary steps actually was found to improue the and does not exclude the existence of k k if k is outside the identifiability propertiesof all kinetic models we have investigated neighborhood W. Nevertheless, if a model is locally identifiable so far. Similarly, all indistinguishable models became distinguishable when a sufficient number of reaction steps were changed everywhere in a bounded parameter set 0, then the problem of to be bimolecular. estimating the parameters from continuous and error-free observations of the given observables has at most a finite number In an apparent contradiction to the above result, it is wellof solutions.llJ2 By contrast, the same response function is known that on the basis of the concentrations of the stable species generated a t an infinite number of different parameter values if in a complex kinetic process it is very difficult to estimate the rate constants of elementary steps with any reasonable a c c ~ r a c y . ~ ~ ~ the J ~ model is unidentifiable. If all reactions are first-order or pseudo-first-order, then the In fact, deterministic identifiability is a necessary but not sufficient condition for practical identifiability, and there is no real kinetic equations are linear, Le., P(x(t,k),k) = A(k) x(t,k), where A(k) is an n X n matrix of first-order rate coefficients. We contradiction. The inability to estimate the parameters in kinetic models in spite of their deterministic identifiability shows that presented two methods for establishing identifiability properties of such systems.2 The first method involves the Laplace transform due attention has to be given to the problem of practical or numerical identifiability, because most frequently the difficulties of the response function and is restricted to the linear case. Here are due to measurement noise, low sampling rate, or inappropriate we show that the second approach, based on the use of statespace similarity transformations,2 can be extended to nonlinear parameter estimation methodology.1° Unlike its deterministic systems. To derive conditions for identifiability by this approach, counterpart, the practical identifiability of a model can be we have to assume that both systems defined by eqs 4 and 5 controlled by experimental design and observation accuracy, a t least to a certain degree. We also show, however, that parameter satisfy conditions for local controllability and observabilityI3 on some open neighborhood of the initial values. These conditions estimation problems in reaction kinetics can be intrinsically illconditioned, and in such cases even relatively small measurement will be discuss_edfurther in the paper. We recall that parameter values k and k are defined to be indistinguishable in the given errors lead to practical difficulties as inherent as the deterministic unidentifiability of the model, although the underlying phenomena experiment if the outputs y and Y of models 4 and 5 , respectively, satisfy eq 6 for all t E [to,t~]. are quite different. We have shown6-*that models 4 and 5 with appropriate local 11. Deterministic Identifiability controllability and observability properties satisfy eq 6 if and only if there exists a linear state-space (coordinate) transformation We restrict consideration to isothermal processes described by x = Tf such that rate equations of the form
-
x(t,k) = P(x(t,k),k)
x(0,k) = xo(k)
(4a)
where P = (PI, P2, ..., P,JT, and each Pi is a polynomial of the concentrations x = (XI, ..., x,)T with coefficients depending on the parameters k = (kl, ..., k p ) T of the model. The parameter vector k is taken from a bounded parameter set Q C Rp. The m-vector y of observables (i.e., the response or output function of the model) is assumed to be a linear function of the concentrations.
where the output matrix C may depend on the parameters. Given a parameter value k, a kinetic experiment is specified by the initial concentration vector xO(k) that may depend on k (e.g., some initial concentrations may be unknown and hence elements of the parameter vector k) and the time interval [to,tl]. Let y(t,k) denote the response function of the model calculated under the conditions of the given experiment. Consider-now another parameter value k E Q. For the parameter vector k the kinetic model takes the form
k(t,k) = P(%(t,ii),k)
2(0,k) = %O(ii)
jyt,k) = C(ii) %(t,k)
(5a)
(5b)
All deterministic identifiability concepts are defined in terms of parameter indistinguishability.' ParameteLvalues k and k are said to be indistinguishable (denoted as k k) in a given set of experiments if
-
y(t,k) = Y(t&
(6)
for all t E [ t o , t l l and in all experiments of the given class. Otherwise k and k are said to be distinguishable. The kinetic model defined by eq 4 is said to be uniquely identifiable at k E Q if k k implies_k = k,Le., k is distinguishable from any other parameter value k E Q. The model is said to be (locally)
-
Tno(k) = xo(k)
(7a)
TP(2,k) = P(T2,k)
(7b)
C(k) = C(k)T (7c) The basic idea of this state-space similarity transformation approach to identifiability analysis is constructing all transformations T that satisfy eqs 7a-c. Each coordinate transfofmation also defines a relationship betwep two points k and k in the parameter space such that k k. If the only transformation that satisfies eqs 7a-c is T = I, where I is the n X n identity mctrix, then the corresponding relationship in the parameter space is k = k, and the model is uniquely identifiable at k. The model is locally identifiable (or simply identifiable) at k if there exists a finite number of feasible transformations T , and hence a finite number of parameter values k i n 0 such that k k. Finally, the model is unidentifiable at kif it is possible to construct an infinite number of coordinate transforEations, and hence to find an infinite number of parameter values k that satisfy eqs 7a-c. Due to the above result, the procedure for establishing identifiability properties of the model given by eqs 4a and 4b is very similar to the one we described for the analysis of first-order reaction systems* and easily applies to models of any size. Let f denote the vector formed by the entries of the matrix T and introduce the notation T = T(f). Apart from the nonsingularity condition (det (T(f)] # 0) the elements of f are a priori free. Since A(%) = Tf,the identifiability problem is reduced to the problem of solving the polynomial equations
-
-
T(f)
%'(E)
= xo(k)
T(f) P(%,k) = P(T(f)%,k)
(sa) (8b)
C(k) = C(k) T(f) (8c) for f. We recall that any such solution must satisfy eqs 8a-c at every point of an open neighborhood Vof the initial conditions foin Rn.Therefore, as we will show, eq 8b gives rise to a separate
Identifiability and Distinguishability
The Journal of Physical Chemistry, Vol. 98, No. 20, 1994 5267
equation for each individual term in the polynomials. The identifiability properties are determined by the number of solutions. There is always a trivial solutionJ'f such that T(P) = I, where I denotes the n X n identity matrix. Thus, the transfoJmation x = T(f'J)lretains the original state variables and yields k = k. If the only solution is the trivial solution P, then the reaction scheme is uniquely identifiable. The reaction scheme is identifiable, but not uniquely identifiable if eqs 8a-c have a finite number of nontrivial solutions for the vector f. In this case we have a finite number of similarity transformations that preserve the response function y(t,k) and also satisfy all known structural constraints. Finallv. the reaction scheme is unidentifiable if there exists an infinity df such transformations, and hence the same response function can be generated by in infinite number of different parameter values. Let us for a moment return to first-order reaction systems with rate expressions of the form P(x,k) = A(k)x, where A(k) is a matrix of first-order rate coefficients. In this case eqs 8a-c are reduced to
T(f) Wo(k) = xo(k)
(9a)
=
(9b)
T(f)
T(f)
C(k) = C(k) T(f)
(9c)
Equations9a-carethesameaseqs2.14aand2.14bofourprevious paper, and thus the method developed here is a straightforward extension of the state-space similarity transformation method for linear systemsa2J2 However, as shown in the following examples, the identifiability properties of first-order and higher order reaction systems substantially differ. Example 1. The first-order consecutive reaction scheme 1 is not uniquely identifiable if the response function is a linear combination of the form of eq 2 with coefficient t g being an unknown parameter. We replace the first step by a bimolecular one, leading to the reaction scheme A,+A2+B,
B d C
=f2
+ of4
(15)
a
Since may differ from 0,this equation does not further restrict the structure of T. Using the coordinate transformation x = T(f)%withf, = 1 and f3 = 0, it can be readily verified that two equations given by 8b take the form
cf2- l)k,Zi -f2k2Z2 = -kl(Rl +f 2 Z 2 ) 2
(16a)
f 4 ( k i Z i - E 9 2 ) = ki(Z1 +f2Z2)2-kJ4R2
(16b)
and
respectively. Equations 16a and 16b must hold for all kl and Zz in an open neighborhood of Zy and Z: = 0 in RZ. Since there is no ZlZ2 term on the left hand side of eq l6a, f z = 0, and eq 16a reduces to '1 = k l . 'Omparing the coefficients of 2: in eq !6b shows = By the ?2 terms$ kz = kZ* and by eq 159 @ = B. Thus, the only transformation that satisfies eqs 8a-c is T = I, and the reaction scheme is uniquely identifiable. Example2. We haveshown thatihe krst-order reaction scheme 3 is unidentifiable if the observable is given by eq 2 and t g is unknown.2 Here we show that the reaction scheme
'.
Al+A2+B,
B-C
is uniquely identifiable under similar conditions. As in example 1, the observable is given by eq 1 1 and tg is unknown. We assume that [A110 = [A2Io = xy. Then the kinetic equations are reduced to XI
= -k,xi
+ k-,x2
(17a)
and we consider the observable given by eq 13. Since eqs 8a and 8b are the same as in example l , f , = 1 a n d h = 0. With this transformation matrix T, the two equations of 8b take the form
(10)
and show that the parameters kl, k2, and t g are uniquely identifiable when the following function is observed:
= e A I I A 1 l + eA2[A21 + EBIBl + cC[cl
(ll)
where t ~€A,,~ and , tc are known coefficients. For simplicity we assume that the reactants A1 and A2 are present in stoichiometric amounts, [All0 = [A210 = xy. Then the kinetic equations are reduced to k, = -k,xf
(12a)
x2 = k , x i - k2x2
(12b)
where X I = [AI] = [Az] and x2 = [B]. The response function is of the form y = ax,
+ BX,
(13)
where a = tal + t~~- 2ec is a known constant and /3 tg - tc is an unknown parameter. Observing the response function y is equivalent to observing eq 11 because y = 0 - 2e& and t& is a known constant. Apriori all entries of the 2 X 2 transformation matrix T(f) are free:
T(f) =
f (i,
f2 f4>
(14)
Since the initial conditions are known, we can write lo = xo, where xo = ( x ~ , O and ) ~ t o = (Zy,O)T. Using these values, eq 8a implies that f1 = 1 and f3 = 0, and eq 8c is given by
respectively. Since there are Z1Z2 terms only on the right hand side of these equations, they can hold on an entire open neighborhood of 20 in RZ only iff2 = 0. Comparing the 2; terms in eq 18a then shows that kl = k l . From the 3 2 terms in the same equation we have E-,= k-& whereas the Z: terms in eq 18b show that f4 = 1, and hence k-l-= k-I. Finally, comparing the coefficients of 32 shows that kz = kz and the model is uniquely identifiable. Examples 1 and 2 show reaction schemes that are either unidentifiable or not uniquely identifiable if all reactions are first order but become uniquely identifiable if some of the elementary steps are replaced by bimolecular ones. Such behavior seems to be the rule rather than an exception. We have investigated a large variety of reactions systems, including models of combustion1°J4 and oscillatory p r o c e ~ s e s . ~In ~ Jmost ~ cases we assumed that the concentrations are measured for all stable sDecies but not for intermediates that are unstable or difficult to isolate. In spite of unobserved concentrations, unique identifiability has been established for all models under reasonable experimental conditions, Le., excluding only the cases in which the observables clearly do not provide information on some reaction steps. This result is surprising, because even much simpler systems of first-order
5268
The Journal of Physical Chemistry, Vol. 98, No. 20, 1994
reactions are frequently unidentifiable. Furthermore, it is not very difficult to construct unidentifiable and/or not uniquely identiable dynamical systems with polynomial or even quadratic nonlinearities.6.7 On the basis of this experience we tried to find unidentifiable or not unique identifiable schemes of higher order reactions but were unable to construct any meaningful example, unless unidentifiability originated in a subset of the scheme with only first-order reactions. Throughout this paper we assume that model 4 satisfies conditions for both local controllability and obser~ability.~J%~ The two conditions together guarantee that the model is a minimum representation; Le., it has the smallest number of state variables among all models generating the same output. Assuming both controllability and observability is critical for the proof of the identifiability theorem presented here.63' However, application of the theorem does not necessarily require establishing these properties for the particular system under study. If model 4 is not locally controllable or not locally observable at ZO,then there can exist a continuous family of transformation matrices T (Le., an infinity of different values for the parameters f i n T(f)) that satisfy the conditions of eqs 8a and 8b, yet the model is uniquely identifiable. Thus, without establishing controllability and observability of a model, the uniqueness of the solution T = I becomes a sufficient but not necessary condition for unique identifiability. Since we have found T = I to be the only solution for all reasonable kinetic models investigated so far, and this implies unique identifiability, we do not further discuss controllability and observability properties here. However, controllability and observability need to be established if there exists a nontrivial transformation satisfying eqs 8a and 8b. This can be done by performing well-defined, although somewhat tedious, algebraic manipulations.5.8 The concepts of controllability and observability can help to establish the notion of reasonable kinetic experiments, which is very difficult to define in a formal way. If a model is uncontrollable and/or unobservable, then (possibly following a state-space similarity transformation) some state variables can be omitted, since their presence does not affect the observable input-output behavior of the system. Since in the nonlinear case similarity transformations may be defined only on some regions of the state space, the ability to reduce the dimensionality of the model may also be restricted to certain regions. In the context of a reaction scheme, uncontrollability and unobservability mean that some parts of the reaction network are not reachable through any reaction step from the reactants with non-zero initial concentrations, or some species do not contribute at all to any of the observables. Such problems are not encountered in reasonable kinetic experiments. III. Deterministic Distinguishability Model distinguishability deals with the uniqueness of a model structure within a class of structures which compete for the description of the behavior of a system. If two models are indistinguishable in the deterministic sense, then through a suitable choice of parameters they yield the same behavior in all considered experiments. Thus, without the use of additional information (e.g., further observables) it is not possible to select a unique model structure. To study model distinguishability we consider model I described by eqs 4a and 4b and a competing model I1 given by
k ( t , k ) = P(Z(t,k),k) Z(0,k) = x o ( k )
(19a)
y(t,k) = C ( k )x(t,k)
(19b)
and
At first glance this pair of models does not seem to differ from the pair of models 4 and 5 we used for identifiability analysis in
Vajda and Rabitz the previous section. However, while eqs 4 and 5 share the same polynomial P and the same output matrix I , and_ only the parameter values differ, in eq 19 the structures of Pand C generally also differ from the structures of P and C,respectively. Similar to the analysis of identifiability we assume that models I and I1 satisfy the conditions for local controllability and local observability. This implies that both models are minimal representations. Thus, the two models have the same number of state variables, and the analysis of deterministic distinguishability is restricted to kinetic schemes with the same number of independent rate equations. Apart from this assumption, models I and I1 are two arbitrary descriptions of a process. Extending the concept of parameter indistinguishability to-two different models, parameter values k E ! l of model I and k E fi of model I1 are said to be indistinguishable ( k k ) if the corresponding response functions are equal, Le., y(t,k) = y(t,k) for all t E [to,t,]. Models I and I1 are defined as indistinguishable if for almost every parameter value k R Of model I there exists an indistinguishable parame@ val!e k E Q of model 11, and vice versa; thus for almost every k E R of model I1 there exists an indistinguishable parameter k E R of model I. In section I1 we used the local state isomorphism theorem and the related concept of system equivalence to formulate a condition for identifiability. The same ideas can be used for the analysis of model distinguishability. By the local state isomorphism theorem parameters k of model I and k of model I1 are indistinguishable in kinetic experiments defined by the initial conditions xo(k) and fO(k) for the two models, respectively, if and only if there exists a coordinate transformation x = Tf such that
-
T ( f )&%,E) = P(T(f)W,k)
c(k)= C ( k ) T ( f ) The above condition has been applied to a variety of kinetic models. We have found that two reaction schemes that consist of unimolecular and bimolecular reactions, and differ a t least in one step, are always indistinguishable from each other in a deterministic sense under reasonable experimental conditions, e.g., observing the concentrations of stable species but not of unstable intermediates. Since first-order reaction schemes are frequencly indistinguishable, this result is as surprising as the intrinsic identifiability of higher order reactions discussed in the previous section. The two results are, however, rather consistent. We have shown that identifiability and distinguishability are closely related.* In particular, the identifiable submodels of an unidentifiable model are indistinguishable from each other and from the original model.2 In the first-order case we used this property for exhaustive modeling2 by generating all reaction schemes that are indistinguishable from a given scheme. Since the presence of bimolecular reactions makes reaction schemes identifiable under reasonable experimental conditions, the existence of indistinguishable models is unlikely. IV. Practical Identifiability and Distinguishability
Deterministic unidentifiability of a model implies that its parameters cannot be determined in any experiment without introducing further observables, or reducing the number of unknown parameters. Deterministic unidentifiability is independent of the parameter values, the sampling rate, and the measurement accuracy. As shown in section 11, systems of predominantly bimolecular reactions with occasional unimolecular steps generally turn out to be identifiable in reasonable kinetic experiments. Nevertheless, it is well documented in the literature that estimating the rate constants of elementary steps by observing
The Journal of Physical Chemisrry, Vol. 98, No. 20, 1994 5269
Identifiability and Distinguishability a complex kinetic process may be hopeless, and the derived values are frequently at variance with the best experimental and theoretical values available. In view of the deterministic identifiability of kinetic models we have just established, this observation implies that the problem is due to practical unidentifiability. Thus, the actual ability of estimating the parameters from a set of kinetic data depends on the true values of the parameters, on the sampling rate, and on the statistics of the measurement noise. In this section we evaluate the effects of these factors. Consider a kinetic experiment in which the m-vector y(t,k) of observables is measured at time points rl, t2, ..., t,, resulting in the observations 31.32, ...,9,. These values relate to the response function y(t,k) of a kinetic model through
1:
0.2
004 00
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2 0
30
k2
yi = y(t,k) + q
(21)
where eidenotes the m-vector of normally distributed observation errors with zero mean and covariance matrix u2Rr,and u2 is (possibly unknown) scalar factor. The least-square estimate k of the parameters minimizes the objective function
Figure 1. Value of Ami, as a function of kz in example 3.
The inequality
provides a lower boundlo on the mathematical expectation of the squared distance (@- 8>T(j3- where Amin denotes the smallest eigenvalue of the Hessian matrix Sr(O)R-lJ(j3). Since the matrix is symmetric, its eigenvalues are real. We will consider a model practically identifiable if Amin > 100u2,because otherwise E((@- 8))1 0.01 by eq 29. By the normalization 25 this indicates that we expect at most 10% errors in the parameter estimates. While the choice of the 10% cutoff is arbitrary, the criterion seems to be generally appropriate for avoiding parameter estimates with inflated absolutevalues, a feature of ill-conditioned estimation problems. Consider a model which is practically unidentifiable on the basis of the above condition. The eigenvectors corresponding to the smallest eigenvalues provide information on the sources of practical unidentifiability. We use the logarithmic transformation 25 in this analysis. The uncertainty region (eq 26) is an ellipsoid in the parameter space with principal axes along the eigenvectors of the Hessian matrix JT(a)R-lJ(a), and the length of the axis along the eigenvector ui is proportional to l/fi,, where A, is the corresponding eigenvalue. Assume that the eigenvector uminis of the form umin= (u, u/C2, ..., u/C,)r, where the C,'s represent coefficients that are relatively constant on an open neighborhood of the least-square estimate &. Due to the logarithmic transformation the presence of such an eigenvector indicates16J7that the objective function (eq 22) essentially depends only on the nonlinear parameter combinations kl/kF, ..., k ,/k?. In particular, the eigenvector u,h = (0.707,0.707,0,...,bTindicates that theobjective functiondependsonlyon theratio kllk2, whereas the eigenvalue u,in = (0.707, -0.707,0,...,0)'shows dependence only on the product klk2. Example 3. We studied the practical identifiability of the scheme 10 with observable 11 as described in example 1, We used the known coefficients EA, = EA^ = 1.0 and ec = 1.5. The unknown parameters of the problem were kl, k2, and j3 = EB - tc. The analysis was performed assuming the values kl = 1.0 and = 3.0, whereas kz was varied between k2 = 0.0 and k2 = 3.0. Using a sampling interval of 0.2 time units, for each value of kz we generated 60 points of the observable 0. No errors were added to the data, but in eq 29 we assumed an error variance e2 = 9 X 1 P that corresponds to approximately 1% measurement noise. Figure 1 shows the smallest eigenvalue Amin as a function of k2. With u2 = 9 X lo" the model is practically unidentifiable whenever Amin < 0.09. According to Figure 1, this is the case for all values of k2, apart from the interval 0.035 I kz I 0.32. On the basis of the form of the eigenvector uminrwe can define four regions I-IV on which the system exhibits different identifiability properties (Figure 1 and Table 1). On region I (k2 < 0.035) the model is practically unidentifiable due to the small
B),
If the model is uniquely identifiable and the observation is errorfree (Le., t i = 0), then Q(k) = 0 is the unique global minimum of Q. For sufficiently large samples the uniqueness of the global minimum is preserved if the error variance is reasonably small.10 However, in spite of unique deterministic identifiability there may exist parameter values k that significantly differ from the least-square estimate k but yield almost the same fit to the data. Any point k of the uncertainty region defined by
for some small e exhibits such unwanted behavior if the norm (k- k)T(k - k) is relatively large. Determining the shape of the uncertainty region is the key to the analysis of practical identifiability. We will show that for kinetic models it is useful to study the uncertainty region in the space of the transformed parameters a and j3 defined by aj = In kj
j = 1, . . . , p
(24)
Pi = k j / k j
j = 1,
...,p
(25)
and
respectively. We first consider the variables j3. The uncertainty region is defined by Q(j3) - Q(B) 5 e, and for small e can be approximated by the ellipsoid
and R-I is a block-diagonal matrix with blocks R-li, i = 1, ..., r. Notice that at the least-square estimate k = k the transformations 24 and 25 yield the same Jacobian matrix, because
In a practically identifiable model the least-squares estimate
B is expected to be closeto the true value of j3 of the parameters.
&'(a
5270 The Journal of Physical Chemistry, Vol. 98, No. 20, 1994
Vajda and Rabitz
TABLE 2 region I I1
N
2
111 IV
X
c
Eigenanalysis in Example 4 eigenvector umin k2 k-1
B
k-1 (midpoint)
Amin
0.75 1.05 1.70
0.001 36
kl -0.11
0.00244
-0.21
0.57
0.78
0.000 24 0.001 46
-0.11
0.98 0.53
-0.04
0.01 0.13 0.15
0.02
0.63
3 -00
-0.56
0.23
0.96
r:
Figure 2. Value of
Amin
3
2
1
0
k- 1 as a function of k-1 in example 4.
TABLE 1: Eigenanalysis in Example 3 eigenvector umin region
k2 (midpoint)
Xmin
ki
I I1
0.02
0.052 0.197
-0.25
111
0.10 0.75
IV
2.30
0.047 0.033
0.66 0.64
-0.02
k2 0.95
-0.58 -0.21 0.71
B 0.17
-0.43 -0.74 0.70
value of k2. Region I1 (0.035 I k2 I0.32) is the interval of practical identifiability. On region III (0.32 Ik2 I1.30) only the parameter combination kl@is determinable, in addition to k2 (see Table 1). As k2 is further increased, on region IV we can determine k l and k2/@. Notice that by eq 12a x1 depends only on k l . The differential equation can even be solved in closed form and yields xl(t,k) = x ~ / ( x ~ k , t 1). The behavior on region IV can be explained in terms of the quasi-steady-state approximation x2 = 0 which becomes acceptable for increasing values of k2. From the QSS assumption, x2 = k l / k 2 x ; . Thus, the second term in eq 13 decays much faster than the first, and hence the estimate of kl is relatively well defined. The second term in eq 13 becomes @kl/k2x:,and since kl can be found separately, the objective function depends only on the ratio @/k2, in good agreement with the form of the eigenvector uminon region IV. Example 4. Here we study the practical identifiability of the scheme described in example 2. The observable is the same as in example 3. In order to eliminate the problems already discussed, we set k l = 1.0 and k2 = 0.1, since in this case the model is uniquely identifiable at k-l 0 (see example 3). The value of k q was varied from k-l = 0.0 to k-1 = 3.0. For each k-l we generated 60 error-free data points of the observable given by eq 11. As in example 3, the condition for practical identifiability is Ami, 1 0.09, However, in this case the value of Amin is substantially (almost 2 orders of magnitude) smaller than the required lower bound at any k-I, and hence the model is unidentifiable, independently of the true value of the parameters. Unlike deterministic unidentifiability, this conclusion is not absolute; i.e., it depends on the magnitude of the error variance and on the design of the experiments. In fact, the sampling interval Af = 0.2 is too large to estimate some of the parameters. As shown in Figure 2, the maximum of Ami,, is attained at k-l = 1.O. The use of the sampling interval of Af = 0.05, and hence 240 data points instead of the original 60, raises the maximum of Ami,, to 0.06 1, and thus almost makes the model practically identifiable. As in example 3, the form of eigenvectors corresponding to Ami, provides some information on the sources of practical unidentifiability (Table 2). If kl < 1.0, then the objective function is rather insensitive to this parameter (region I in Figure 2 ) . Increasing k-1 causes its effect to become comparable to that of kZ, and Ami, attains its maximum (region 11). On region I11 the objective function is not sensitive to the parameter k2. Further increasing k-l makes the QSS assumption become applicable,
+
and the response function depends only on the combination p k l l k2 of the parameters. As shown in the above examples and more directly in our previous work,16J7 practical unidentifiability of the parameters in a kinetic model frequently relates to the validity of a quasisteady-state ( Q S S ) assumption on the concentration of one or several intermediates. A QSS assumption of the form xi = 0 reduces the kinetic differential equation for xi into an algebraic equation that can be solved for xi. Substituting these solutions into the rest of the model reveals that the resulting equations do not necessarily depend on all the original parameters. In fact, in most cases the QSS equations include only some (generally nonlinear) combinations of the parameters such as ratios or products. Thus, if QSSA is a good approximation, then only these nonlinear combinations are identifiable, and the problem of identifying all parameters is inherently ill-posed. Conversely, the result of principal component analysis, in particular the form of eigenvectors corresponding to small eigenvalues, frequently suggests strong nonlinear dependences among the parameters that can be explained by one or more QSS assumptions. Example 5. Since the reaction scheme in example 4 is practically unidentifiable when the sampling interval of At = 0.2 is used, it is expected that thedata generated under theseconditions can be well described by simpler models, including the one in example 3. To test this assumption, we used the values kl = 1.O, k2 = 0.1, k-1 = 1.05, and CB = 3.0 that give the maximum Ami,, for the four-parameter model. Assuming k-l= 0.0 and adjusting the values of the other parameters, we found the minimum of the least-squares objective function at kl = 0.42 f 0.024, k2 = 1.25 f 0.045, and EB = 5.00 f 0.24. At these parameter values the mean residual errors is 0.024. Using error-free data, there is some nonrandomness in the residuals, but the obtained mean residual error of 0.024 is compatible with the assumed 1% error of the observations. The largest residual error is 2.46%. Thus, in spite of the large differences in the parameter values, the two reaction schemes are practically indistinguishable. However, increasing the sampling rate or reducing the measurement noise can clearly reverse this conclusion. In this paper we discussed only simple reaction schemes that included irreversible steps, and all unknown parameters were regarded as apriori free. However, detailed mechanisms of kinetic processes are often assumed to include only reversible steps in order to satisfy the thermodynamic equilibrium condition known as the principle of microscopic reversibility. This condition yields one or more equality constraints among the rate coefficients, and thus some parameters can be expressed as functions of the others. As the number of parameters is reduced, the condition of microscopic reversibility improves both deterministic and practical identifiability properties but is frequently unable to remove inherent practical unidentifiability. V. Conclusions As shown in our previous paper,2 following a reaction by observing some quantities such as absorbance instead of the concentrations of all species, we found that neither the rate coefficents nor the reaction scheme itself is necessarily unique for first-order kinetic processes. Using a deterministic analysis, we have shown that nonuniqueness is frequently present, even assuming an idealized experiment in which the observable quantities are known continuously and error-free. In this paper
Identifiability and Distinguishability we extended the deterministic analysis to systems including higher order reactions with mass-action-type rate equations. To our surprise, in all examples studied we have found that the Occurrence of nonlinearity in the rate equations eliminates unidentifiability and/or indistinguishability of kinetic models in reasonable experiments. Thus, networks of first-order and second-order reactions seem to exhibit an interesting "holographic" property: the entire model can be uniquely reconstructed from the concentrations of stable species or related response functions, provided that these functions are known continuously and errorfree. It is emphasized that this result is only a conjecture, supported by the analysis of a variety of problems, and by our failure to construct even a single unidentifiable kinetic model for a reasonable kinetic experiment. While the proofofthisconjecture for a relatively general class of kinetic models would be a theoretical challenge, from a practical point of view the analysis of deterministic identifiability is more important for linear than for nonlinear kinetic models. Although detailed mechanisms of processes including mainly bimolecular reaction steps seems to be generally identifiable in the above deterministic sense, the presence of measurement noise and limitations on the sampling rates substantially limit the practical ability of reconstructing the model or its parameters. Because the problem is numerical rather than deterministic, the well-known difficulty of deriving rate constants for elementary steps from kinetic data on complex reactions can be controlled up to a certain degree by experimental design and high measurement accuracy. However, estimation with any reasonable accuracy frequently requires unrealizable experimental conditions (Le., too high measurement accuracy or extremely dense sampling). Therefore, practical unidentifiability may be a
The Journal of Physical Chemistry, Vol. 98, No. 20, 1994 5271 limitation as inherent as the deterministic identifiability we previously described for a number of first-order reaction systems.
Acknowledgment. H.R.acknowledges support from the Department of Energy.
References and Notes (1) Walter, E.;Lecourtier, J.; Happel, J.; Kao, J. Y. AIChE J . 1986,32, 1360-66. (2) Vajda, S.;Rabitz, H. J . Phys. Chem. 1988,92,701-07. (3) Bauer, S.H. Int. J. Chem. Kinet. 1990,22, 113-33. (4) Isidori, A. Nonlinear Control Systems: An Introduction; Springer: New York, 1982. (5) Tunali, E. T.; Tarn, T. J. IEEE Trans.Autom. Conrroll987,AC-32, 146-54. (6) Vajda, S.;Rabitz, H. IEEE Trans. Autom. Control 1989,AC-34, 220-23. (7) Vajda, S.;Godfrey, K.R.; Rabitz, H. Math. Eiosci. 1989,93,21748. (8) Chappell, M.;Godfrey, K. R.; Vajda, S. Math. Eiosci. 1990,102, 41-73. 19) Edelson. D.: Allara. D. L. AICHE J. 1986. 32. 1360-66. (10) Vajda,S.; Rabitz,H:; Walter, E.;Lecourtier,J. Chem.Eng. Commun. 1989,83,191-219. (1 1) Vajda, S.Math. Eiosci. 1981,155, 39-64. (1 2) Walter, E. Identijiability of State-Space Models; Springer: New York, 1982. (13) Hermann, R.; Krener, A. J. IEEE Trans. Autom. Control 1977, AC-22, 728-40. (14) See contributions to the following collections: (a) Chemistry of Combustion Reactions; Gardiner, W. C., Jr., Ed.; Springer: Berlin, 1983. (b) Complex Chemical Reaction Systems: Mathematical Modeling and Simu; Berlin, 1987. lation; Warnatz, J., Jager, W., as.Springer: (1 5) See contributions to: Oscillations and Travelling Wavesin Chemical Systems; Field, R. J., Burger, M.,Eds.;Wiley: New York, 1985. (16) Vajda, S.;Turanyi, T. J. Phys. Chem. 1986,90,1664-70. (17) Vajda, S.;Valko, P.; Turanyi, T. Int. J . Chem. Kinet. 1985,17,5581.