J. Phys. Chem. A 2002, 106, 5063-5071
5063
ARTICLES The Chemical Langevin and Fokker-Planck Equations for the Reversible Isomerization Reaction† Daniel T. Gillespie‡ Dan T. Gillespie Consulting, 812 West Vicki AVenue, Ridgecrest, California 93555 ReceiVed: July 26, 2001; In Final Form: December 17, 2001
This paper uses the simple reversible isomerization reaction to illustrate and clarify the roles played in chemical kinetics by recently proposed forms for the chemical Langevin equation and chemical Fokker-Planck equation. It is shown that the stationary solution of the chemical Fokker-Planck equation for this model reaction provides, for most purposes, an excellent approximation to the stationary solution of the chemical master equation. It is also shown that, when allowance is made for the stipulated macroscopic nature of the time increment dt in the chemical Langevin equation, the changes in molecular population during dt predicted by that equation for this model reaction closely approximate the changes prescribed by the chemical master equation. The discussion highlights the role of the chemical Langevin equation as not only a potential computational aid but also a conceptual bridge between the stochastic chemical master equation and the traditional deterministic reaction rate equation.
1. Introduction When molecules of a well-stirred mixture of N molecular species {S1, ..., SN} interact through M chemical reaction channels {R1, ..., RM}, the molecular population vector X(t) ≡ (X1(t), ..., XN(t)), where
Xi(t) ≡ the number of Si molecules in the system at time t (i ) 1, ..., N) (1) changes stochastically because of the inherent randomness of molecular collisions. If the molecules are confined to a fixed volume and kept at constant temperature, straightforward kinetic theory arguments show that for each reaction channel Rj there is a function aj such that1
aj(x) dt ≡ the probability, given X(t) ) x, that one Rj reaction will occur in the system in the next infinitesimal time interval [t,t+dt) (j ) 1, ..., M) (2) This propensity function aj, together with the state-change Vector νj ≡ (νj1, ..., νjN) as defined by
νji ≡ the change in the number of Si molecules produced by one Rj reaction (j ) 1, ..., M; i ) 1, ..., N) (3) completely characterizes reaction channel Rj. So, for example, if Rj is the reaction S1 + S2 f 2S1, then νj ) (+1, -1, 0, ..., 0) and aj(x) ) cjx1x2 where cj in this case is the conventional reaction rate constant kj divided by the volume of the system. † Part of this work was performed while the author was a Staff Scientist in the Research Department of the Naval Air Warfare Center in China Lake, California. ‡ E-mail:
[email protected].
Using only eqs 2 and 3 and the laws of probability theory, one can prove that the probability P(x,t|x0,t0), that X(t) will equal x given X(t0) ) x0 for t g t0, obeys the chemical master equation (CME):1,2 M
∂
P(x,t|x0,t0) )
∂t
[aj(x - νj) P(x - νj,t|x0,t0) ∑ j)1 aj(x) P(x,t|x0,t0)] (4)
Equations 2-4 imply that the system’s state point X(t) performs a “random walk” on the integer lattice in the N-dimensional species population space; in mathematical terms, X(t) is a jump MarkoV process. But if the molecular population levels happen to be so large that the granularity of the integer lattice is not noticeable, the randomness in the trajectory of X(t) is often also not noticeable. In that case, the trajectory takes on the character of a continuous, deterministic process which is described by the set of ordinary differential equations
dXi(t) dt
M
)
νjiaj(X(t)) ∑ j)1
(i ) 1, ..., N)
(5)
This is the well-known reaction rate equation (RRE) of traditional chemical kinetics, although expressed in terms of molecular populations instead of concentrations. For well-stirred systems the CME (4) has a firm microphysical basis,1 so for such systems it describes accurately the effects of molecular level randomness. In contrast, the RRE (5), which also requires the system to be well-stirred, is a more phenomenological equation; yet we know from experience that it describes most macroscale chemical systems quite well. Just how the CME (4) gets supplanted by the RRE (5) as a chemical
10.1021/jp0128832 CCC: $22.00 © 2002 American Chemical Society Published on Web 04/25/2002
5064 J. Phys. Chem. A, Vol. 106, No. 20, 2002 system approaches the “thermodynamic limit” of infinite molecular populations has been the subject of much study and considerable debate for several decades. Recently this issue has become more than merely academic: Biochemists are finding that, inside a living cell, the relatively small molecular population levels of some key reactant enzymes can sometimes cause molecular level randomness to have a dramatic impact on cellular development.3 In an attempt to articulate more clearly the relation between the CME (4) and the RRE (5), this writer recently presented in ref 4 arguments showing that, under certain specific conditions, the jump Markov process defined by the CME (4) can be decently approximated by a continuous Markov process that satisfies the following chemical LangeVin equation (CLE):4 M
Xi(t+ dht) ) Xi(t) +
νjiaj(X(t)) dht + ∑ j)1
Gillespie RRE (5) in differential form, replacing the left side by [Xi(t+dt) - Xi(t)]/dt and then multiplying through by dt, we should really use some kind of “macroscopic” dt. Notice that if we used dht in that differential form, we would obtain the CLE (6) except for the last summation term therein. As was discussed in ref 4, eq 6 is but one of several different candidates for “the” chemical Langevin equation that have been proposed in the prior literature. The main contribution of ref 4 was to show that eq 6 has the distinction of being rigorously deriVable from the same premise (2) that underlies the CME (4), by making specific approximations that should be valid whenever conditions i and ii hold. It is known in continuous Markov process theory that every LangeVin equation for a process X(t) is accompanied by a unique Fokker-Planck equation for the probability density function P(x,t|x0,t0) of that process. The Fokker-Planck equation corresponding to the specific Langevin eq 6 turns out to be4
M
νjiaj1/2(X(t)) Nj(t) (dht)1/2 ∑ j)1
(i ) 1, ..., N) (6)
N
∂
P(x,t|x0,t0) ) -
∂t Here, N1(t), ..., NM(t) are M statistically independent, temporally uncorrelated normal (or Gaussian) random variables with means 0 and variances 1; and dh t is a positiVe macroscopically infinitesimal time increment, which will be defined more precisely in a moment. The CLE (6) evidently tells us how, if we know the state of the system at time t, we can compute the state at the slightly later time t + dht; in principle, this is all we need to trace the time evolution of the system.5 The definition of the macroscopic infinitesimal dht in eq 6 is important, because it defines the special circumstances under which that approximate equation is valid: The key requirement is that dht be (i) small enough that none of the propensity functions aj changes in a macroscopically noticeable way during dht, yet (ii) large enough that each reaction channel Rj fires many more times than once during dht. Only to the extent that the system admits a dht satisfying both of these conditions will the CLE (6) decently approximate the time evolution of the process X(t) defined by the CME (4). In cases where it is not possible to find a dht that satisfies both conditions i and ii, the CLE (6) will not be a reliable approximation to the CME (4). This notion of a macroscopically infinitesimal time increment is not at all new in physics or chemistry. For example, the definition of electrical current as the ratio dQ ÷ dt, where dQ is the charge passing in infinitesimal time dt, is meaningful only if dt is a macroscopic infinitesimal; because, if dt were allowed to be arbitrarily close to zero, as for a “true” infinitesimal, we would eventually observe “shot noise” as charge passes by in discrete chunks (on electrons): the ratio dQ ÷ dt would not approach a well-defined limiting value. So, in conventional electrical circuit theory, it is always tacitly understood that the dt in the ratio dQ/dt is large enough that very many electrons pass by in time dt. But this “macroscopic” character of the infinitesimal dt in electrical circuit theory is rarely called out in a notationally explicit way, as we have done in eq 6; indeed, eq 6 appears in ref 4, where it was derived, without the overbar on the “d.” The reason for the notational emphasis in this paper will become clear later. Even the derivative in the RRE (5) presumes a macroscopic dt: The change [Xi(t+dt) - Xi(t)] in the number of Si molecules between times t and t + dt approaches zero with dt not continuously but rather through discrete values (and likewise for the concentration of Si), a behavior that is really not allowed in a differentiable function. Therefore, if we were to write the
N
∑ 2 i)1
∂2 ∂2
∑ i,i′)1∂x ∂x i2. But that is not the way eq 7 is deduced in ref 4: In ref 4, eq 7 is inferred directly from eq 6 by appealing to some general results in continuous Markov process theory [see, e.g.: Gillespie, D. T. Am. J. Phys. 1996, 64, 1246-1257], and eq 6 in turn is shown to be a direct approximate consequence of the fundamental premise (2) whenever conditions i and ii are satisfied. (7) Zwanzig, R. J. Phys. Chem. B 2001, 105, 6472-6473. (8) In ref 7, the distribution (A1), which reduces to eq 11, was incorrectly identified as a Poisson distribution. The canonical forms of the binomial and Poisson distributions are given in the table on p 929 of the Handbook of Mathematical Functions (Abramowitz, M.; Stegun, I. A.; National Bureau of Standards: Washington, DC, 1964), which also shows the binomial mean and variance formulas (12). It turns out that the binomial distribution (11a) becomes a Poisson distribution in the dual limit xT f ∞, q f 0, with xTq a finite constant. With the definition (11b), this Poissonian limit is xT f ∞, c2/c1 f 0, with xTc2/c1 a finite constant. But we have no need to invoke that special limit here.
J. Phys. Chem. A, Vol. 106, No. 20, 2002 5071 (9) The assertion in ref 7 (near the end of its section 1) that “because of the linearity of the particular rate equation used here, Gaussian noise leads to a Gaussian equilibrium distribution” is incorrect. The rate equation in question is a LangeVin equation, which depends not only on a drift function A but also a diffusion function D. When A is linear and D is a constant, the stationary distribution will indeed be normal (a well-known example being the Ornstein-Uhlenbeck process), and our eq 13 shows that this is the case when c1 ) c2. But if both A and D are linear, as eq 13 shows is the case when c1 * c2, the stationary distribution will not be normal. The reason is essentially that, if D(x) ∝ x and X(t) happens to be normal, then the diffusion term D1/2(X(t)) N(t) dt1/2 in the Langevin equation will contain the product of a normal with the square root of a normal, which is not a normal; therefore, X(t+dt) will not be normal. (10) By definition, Θ(z) ) 1 if z > 0 and 0 otherwise. So, if the random variable U(0,1) takes a value between 0 and a2(x) dt, which will happen with probability a2(x) dt, the first Θ function in eq 19b will be 1 and the second will vanish; hence, with probability a2(x) dt the expression in eq 19b will be +1, in agreement with eq 19a. If the random variable U(0,1) takes instead a value between 1 - a1(x) dt and 1, which will happen with probability a1(x) dt, the second Θ function in eq 19b will be 1 and the first will vanish; so, with probability a1(x) dt the expression in eq 19b will be -1, again in agreement with eq 19a. If neither of these two things happen, both Θ functions in eq 19b will vanish, and we have the third eventuality in eq 19a. (11) The stochastic simulation algorithm (SSA) is described in: Gillespie, D. T. J. Comput. Phys. 1976, 22, 403-434; J. Phys. Chem. 1977, 81, 23402361. However, in ref 7 a different simulation procedure was described (though not actually carried out). In that procedure, one chooses a time step ∆t, draws a sample value r of the unit-interval uniform random variable U(0,1), and then effects in the next ∆t an R2 reaction if r < a2(x) ∆t, or an R1 reaction if r > 1 - a1(x) ∆t, or no reaction otherwise. This is a conceptually straightforward implementation of the fundamental premise (2), but it has the practical drawback that it is approximate; it becomes exact only in the limit that ∆t becomes infinitesimally small, in which limit the procedure becomes infinitely slow and infinitely consuming of random numbers r. The SSA by contrast is exact with respect to premise (2), and hence also with respect to the CME (4). It requires two random numbers for each reaction eVent; one of those random numbers determines the time to the next reaction event, and the other determines the identity (index) of that reaction. The SSA does not require one to choose a time step size, nor does it entail approximating an infinitesimal time interval dt by a finite time interval ∆t. The main limitation of the SSA derives from the fact that it does dutifully simulate eVery reaction event that occurs in the system: If the molecular population level of any reactant species happens to be so large that an enormous number of reaction events actually occur per unit of real time, the progress of the SSA in real time will be extremely slow. Of course, this limitation also applies to the other simulation algorithm. (12) For an extensive discussion of the Langevin and related approximation strategies for accelerating the stochastic simulation algorithm, see: Gillespie, D. T. J. Chem. Phys. 2001, 115, 1716-1733. Work in this area is ongoing. (13) This linear scaling of the propensity functions with the molecular populations actually obtains for all types of reaction in the thermodynamic limit, wherein the molecular populations Xi and the system volume Ω approach infinity in such a way that the concentrations Xi/Ω remain finite: The propensity function aj of an mth order reaction will contain m molecular population factors, but its reaction probability rate constant cj will contain a factor Ω-(m-1) (see ref 1), which effectively “cancels” all but one of the population factors in the thermodynamic limit. The CLE (6) thus implies quite generally the well known rule-of-thumb that “fluctuations scale like the square root of the molecular population.”