J. Phys. Chem. 1993, 97, 7988-7992
Mass-Coupled Chemical Systems with Computational Properties

A. Hjelmfelt and J. Ross*

Department of Chemistry, Stanford University, Stanford, California 94305

Received: April 28, 1993
We simulate a pattern recognition device constructed from mass-coupled chemical reactors. Each reactor is an open system containing a bistable (iodate-arsenous acid) reaction. We store arbitrary patterns of high and low iodide concentrations in the network using a Hebb type rule. When the network is initialized with a pattern which is similar to a stored pattern, errors in the initial pattern are corrected and the similar stored pattern is recalled. When the network is initialized with a pattern which is not similar to a stored pattern, the network evolves to a homogeneous state which signals nonrecognition. The network is a programmable parallel computational device.
1. Introduction

All living beings must process information to one extent or another using a hardware that is primarily chemical, be it by macroscopic dissipative processes such as biochemical switches or by molecular properties such as conformations of a protein. Computations using simple chemical networks are of interest because they provide a link between more complicated biological networks and electronic or mechanical networks. Neural network theory,1-4 originally inspired by biological neural networks, can be used for electronic computations. It is worth considering whether neural networks can be implemented chemically, and if so with which chemical networks. In this work, we continue5-7 our development of a pattern recognition network based on a chemical hardware.8

Neural networks are highly interconnected assemblies of simple elements, called neurons, that are usually two-state devices. The state of each neuron is communicated to the other neurons and serves as part of the input to the other neurons. A neuron in the network performs a weighted sum of all the states of the neurons which serve as the input, and when the weighted sum passes a certain threshold the state of the neuron changes. This model of neuronal activity is termed the McCulloch-Pitts neuron model,9 which follows from physiological studies showing that the activity of a neuron can be approximated as being either "low" or "high" depending on the amount of stimulation. Due to the interconnectivity, networks of these simple elements can show complex behavior: multiple steady states,10,11 hysteresis,12 oscillations,13 chaos,14 or formal undecidability.15 Neural networks serve as models of cooperative phenomena and collective computation.13,15-17 The computational processes performed by neural networks are usually parallel, and thus they provide a framework for developing massively parallel architectures. Neural networks are also robust in the presence of noise or damage to the network; they fail "gracefully", whereas serial (normal) computations fail catastrophically.

Although a variety of different types of computations can be performed by neural networks, we focus on a computation termed "associative memory" or "categorization". In this type of computation, a certain number of patterns are stored in memory, to be recalled later. During operation, patterns are presented to the network, and it signals whether a presented pattern has been stored previously. Ideally, patterns should be recognized if they contain sufficiently small numbers of errors; in this case, the network recalls the entire error-free pattern from the flawed input. We use a network where stored information is contained in the connections among the neurons and is thus spread throughout the network. This is radically different from a standard computer, which localizes the data, with one register per data item, for example. Physiological
evidence tends to imply that memory is delocalized in the brain; whether it is stored in the neuronal connectivities is as yet unproven.

In previous work, we discussed an implementation of serial and parallel neural networks using a homogeneous chemical reaction.5-7 In this work, we continue the development of a network using a chemical kinetic network consisting of reaction mechanisms with multiple stationary states coupled by mass transfer.8 We describe the implementation of the network and assess its properties as a programmable computational device.

2. Mass-Coupled Chemical Systems

As a single element in the network, we use a bistable chemical system, described, for simplicity, by a single variable x_i. Consider a reaction such as the iodate-arsenous acid reaction run in a CSTR (continuously stirred tank reactor), which shows bistability between states of low and high iodide concentrations for a range of flow rates and reactant concentrations. Under experimental conditions with arsenous acid in stoichiometric excess, the net reaction is18
IO3^- + 3H3AsO3 → I^- + 3H3AsO4   (1)
The intermediate steps of this net reaction contain an autocatalysis in iodide, which provides the nonlinearity in the mechanism and the resultant bistability. For a single CSTR, in which the only input into the CSTR is the reactant flow, the system is described by a single differential equation18,19

dx/dt = -k_B x^3 + [k_B(X_0 + Y_0) - k_A] x^2 + [k_A(X_0 + Y_0) - k] x + k X_0   (2)
where k_A and k_B are effective rate constants, k is the reactant flow rate, x is the concentration of I^-, X_0 is the inflow concentration of I^-, and Y_0 is the inflow concentration of IO3^-. The values used in the calculations are given in Table I. In the derivation of eq 2 from a two-variable system, a conservation constraint is used, x + [IO3^-] = X_0 + Y_0; this constraint is valid when initial startup transients have decayed in a single reactor or in a network of mass-coupled reactors.21

The coupling of chemical systems by mass transport,22-25 electric currents,26 a computer interface,27 or chemical means28 has been studied. We couple the CSTRs by mass transfer. If the mass transfer is diffusional, then it is reciprocal; if the mass transfer is by active transport, then we require it to be reciprocal.
TABLE I: Parameters Used in the Numerical Calculations, Obtained in Part from Ref 20^a

    parameter    value
    k_A          0.21 M^-1 s^-1
    k_B          21 000 M^-2 s^-1
    X_0          2.0 × 10^-5 M
    Y_0          7.1 × 10^-4 M
    k            7.18 × s^-1

^a Under these conditions, [I^-] = 0.00013 M and 0.00058 M in the two possible stable steady states and 0.000357 M in the unstable steady state for an isolated CSTR with kinetics described by eq 2.
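The steady states of a single CSTR are the positive real roots of the cubic on the right-hand side of eq 2, so bistability can be checked directly. The following sketch (Python with NumPy; our illustration, not part of the original paper) uses the rate constants and inflow concentrations of Table I; because the printed value of the flow rate k is incomplete in this copy, k is treated as a scan parameter, and a window with three coexisting positive roots signals bistability (the outer two roots are the stable states, the middle one the unstable state).

    import numpy as np

    # Rate constants and inflow concentrations from Table I (M, s units).
    kA, kB = 0.21, 21000.0       # effective rate constants
    X0, Y0 = 2.0e-5, 7.1e-4      # inflow concentrations of I- and IO3-

    def steady_states(k):
        """Positive real roots of the right-hand side of eq 2 at flow rate k."""
        coeffs = [-kB, kB * (X0 + Y0) - kA, kA * (X0 + Y0) - k, k * X0]
        return sorted(r.real for r in np.roots(coeffs)
                      if abs(r.imag) < 1e-12 and r.real > 0)

    # Scan the flow rate; three coexisting positive roots signal bistability.
    for k in [5.0e-4, 1.0e-3, 1.5e-3, 2.0e-3, 2.5e-3, 3.0e-3, 4.0e-3]:
        ss = steady_states(k)
        label = "bistable" if len(ss) == 3 else "monostable"
        print(f"k = {k:.1e} s^-1: {label}, [I-] steady states = {ss}")

With the Table I values as read here, the scan locates a bistable window at flow rates of roughly a few times 10^-3 s^-1.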
For a collection of mass-coupled CSTRs, each described by a single variable, the time evolution of the ith tank is given by21

dx_i/dt = f(x_i) + λ Σ_{j≠i} k_ij (x_j - x_i)   (3)

where f(x_i) contains the rates of reactions and the flows between the CSTRs and the reservoirs; it is given by the right-hand side of eq 2 for the iodate-arsenous acid reaction. If the transport is by active pumping, then we suppose that we may neglect any delays due to the volume of the tubes connecting the CSTRs. A network of this type has only fixed point attractors.21 If an uncoupled CSTR is monostable (f(x) has a unique positive root), the network of mass-coupled CSTRs also has one steady state: each CSTR has the same value of x, and f(x) = 0. Thus, different patterns cannot be stored in such a network. If the uncoupled CSTR is bistable (f(x) has three positive roots), then the coupled network may have 2^N coexisting stable stationary solutions. Thus, such a network can store patterns, provided that a rule for the storage can be found. Bistability of an isolated CSTR is essential for the functioning of the network; monostability, which is typically used in neural networks, is insufficient.

The patterns which are stored in the network are determined by the connectivity of the network. Suppose we wish to store a number of patterns in the network. Diffusion always acts to drive connected CSTRs to homogeneity. Thus, CSTRs which should be in the same state in all of the stored patterns should be coupled most strongly, whereas CSTRs which are always in different states should be entirely uncoupled. Those that are in the same state in some of the stored patterns and in opposite states in others should be coupled with some intermediate strength. This idea is embodied in the so-called "Hebbian rule"29 used extensively in neural network theory. Let R_i^p be the activity of the ith CSTR in the pattern p; R_i^p = 1 for the high [I^-] state and R_i^p = 0 for the low [I^-] state. We take the connection strength between CSTRs i and j to be

k_ij = (1/N) Θ[Σ_{p=1}^{P} (2R_i^p - 1)(2R_j^p - 1)]   (4)

where Θ(x) = x if x ≥ 0 and Θ(x) = 0 if x < 0. CSTRs which are in opposite states in the majority of the stored patterns are entirely uncoupled, and CSTRs which are in the same state in the majority of the stored patterns are coupled by a strength which increases with the number of stored patterns that have the CSTRs in the same state. This rule stores both a pattern and its negative: for example, the pattern R_i = 1, R_j = 1 results in the same k_ij as its negative R_i = 0, R_j = 0.
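As a concrete illustration of eq 4 (a minimal Python/NumPy sketch of our own; the variable names are not from the paper), the coupling matrix for P random stored patterns can be built in a few lines, and tallying the strengths reproduces the statistics quoted in section 3 for P = 3: about half of the pairs uncoupled, three-eighths at strength 1/N, and one-eighth at 3/N.

    import numpy as np

    rng = np.random.default_rng(0)
    N, P = 36, 3                          # network size and stored patterns
    R = rng.integers(0, 2, size=(P, N))   # R[p, i] = 1 (high [I-]) or 0 (low)

    # Eq 4: k_ij = (1/N) * Theta( sum_p (2R_i^p - 1)(2R_j^p - 1) ),
    # with Theta clipping negative arguments to zero.
    S = (2 * R - 1).T @ (2 * R - 1)       # pattern overlaps, entries in -P..P
    K = np.maximum(S, 0) / N              # coupling strengths
    np.fill_diagonal(K, 0.0)              # no self-coupling

    # Tally the distinct strengths over the N(N-1)/2 pairs.
    vals, counts = np.unique(K[np.triu_indices(N, 1)], return_counts=True)
    for v, c in zip(vals, counts):
        print(f"k_ij = {v:.4f}: fraction {c / counts.sum():.3f}")

Note that the symmetry of K expresses the reciprocity of the mass transfer required above.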
To show that this Hebbian rule generates stable states corresponding to the stored patterns, we perform an approximation on eqs 3 and 4. Inserting the Hebbian rule into eq 3, we obtain

dx_i/dt = f(x_i) + (λ/N) Σ_{j≠i} Θ[Σ_{p=1}^{P} (2R_i^p - 1)(2R_j^p - 1)] (x_j - x_i)   (5)

We assume that the network is initialized with a pattern which resembles the stored pattern p′; that is, almost all x_j = x_j^{p′}, where x_j^{p′} denotes the stable steady-state concentration assigned to CSTR j by the pattern p′ and y_j^{p′} denotes the other stable state. The stored patterns are assumed to be random, so roughly N/2 of the CSTRs are in one stable state and the other half are in the other state. For a CSTR j with R_j^{p′} = R_i^{p′}, the sum over patterns in eq 5 contains a contribution +1 from p′ and x_j = x_i^{p′}; for R_j^{p′} ≠ R_i^{p′}, the contribution is -1 and x_j = y_i^{p′}. Since the multiplier of (x_j - x_i) appears linearly in the summation over j, we can approximate it with its average value for these two cases

dx_i/dt = f(x_i) + (λ/2)[(x_i^{p′} - x_i)⟨Θ[1 + Σ_{p≠p′} (2R_i^p - 1)(2R_j^p - 1)]⟩ + (y_i^{p′} - x_i)⟨Θ[-1 + Σ_{p≠p′} (2R_i^p - 1)(2R_j^p - 1)]⟩]   (6)

The averages (denoted ⟨ ⟩) can be computed by combinatorics. The probability that R_i^p = R_j^p in n patterns out of P - 1 (we already know whether R_i^{p′} = R_j^{p′}) is given by the binomial distribution
P_n = C(P-1, n) (1/2)^{P-1}   (7)

where C(P-1, n) is the binomial coefficient. The value of the summation Σ_{p≠p′} (2R_i^p - 1)(2R_j^p - 1) when R_i^p = R_j^p in n patterns is 2n - P + 1, because we add n × 1 and (P - 1 - n) × (-1). The argument of the Θ's in eq 6 is either this value plus or minus one, and the function Θ converts all the negative values of its argument to zero, so

⟨Θ[1 + Σ_{p≠p′} (2R_i^p - 1)(2R_j^p - 1)]⟩ = Σ_{n=P/2}^{P-1} (2n - P + 2) C(P-1, n) (1/2)^{P-1}   (8)

α(P) ≡ ⟨Θ[-1 + Σ_{p≠p′} (2R_i^p - 1)(2R_j^p - 1)]⟩ = Σ_{n=P/2}^{P-1} (2n - P) C(P-1, n) (1/2)^{P-1}   (9)

If P is odd, then the index, n, of the sums runs from the next integer larger than P/2. The difference of the two averages is

⟨Θ[1 + Σ_{p≠p′} ···]⟩ - ⟨Θ[-1 + Σ_{p≠p′} ···]⟩ = 2 Σ_{n=P/2}^{P-1} C(P-1, n) (1/2)^{P-1} = 1   (10)

because the summation in eq 10 is just one-half of the total probability in the binomial distribution. Thus, eq 6 can now be written as

dx_i/dt = f(x_i) + (λ/2)[(x_i^{p′} - x_i)(1 + α(P)) + (y_i^{p′} - x_i) α(P)]   (11)

where α(P) is the value of eq 9. The relative magnitude of the two terms in the square brackets depends on α(P): it grows monotonically from α(2) = 0, is about 0.5 at P = 5, and is about 2 when P = 40. When α(P) is sufficiently small and λ is chosen sufficiently large, diffusion overcomes the restoring force of the reaction when x_i = y_i^{p′} and drives x_i to x_i^{p′}. Thus, the stationary state corresponding to p′ is stable, and patterns with a few errors relative to p′ are within its basin of attraction. If λ is chosen too large, diffusion drives the entire network to homogeneity, which violates the assumption that x_j = x_j^{p′} (j ≠ i) used in deriving eq 11.
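The average α(P) of eq 9 is straightforward to evaluate numerically. A short sketch (Python standard library only; our own illustration) sums the clipped binomial terms directly; it gives α(2) = 0, α(5) = 0.4375 (quoted as about 0.5 above), and α(40) ≈ 2.0.

    from math import comb

    def alpha(P):
        """Eq 9: average of Theta[-1 + sum] over random patterns, where the
        number n of agreements among the other P - 1 patterns is binomial."""
        return sum(max(0, 2 * n - P) * comb(P - 1, n)
                   for n in range(P)) / 2 ** (P - 1)

    for P in (2, 5, 40):
        print(P, alpha(P))    # 0.0, 0.4375, and approximately 2.0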
3. Results

Our computer calculations simulate the following experiment. Given a set of patterns to be stored, we calculate the flow rates between the CSTRs using eq 4. If, for example, three patterns are stored, then there are three possible values of k_ij: on the average, one-half of the connections are of strength 0, three-eighths are of strength 1/N, and one-eighth are of strength 3/N. All CSTRs in the network have the same reactant concentrations in the feed reservoirs and the same flow rates (excluding the flows due to the coupling).
Figure 1. Example time series obtained from a chemical network of 36 coupled bistable CSTRs with three stored patterns. The x-axis gives the CSTR number, and the y-axis gives the time. The shading indicates the I^- concentration, with the stable steady states of an isolated CSTR being pure black or white. The initial state at t = 0 is a pattern with 10 errors relative to the stored pattern. The steady state of the network has 2 errors relative to the pattern p′.
To start the experiment, the pumping between the CSTRs is set to zero. The states of the uncoupled CSTRs are perturbed and allowed to relax to the steady state representing the presented pattern; only then is the pumping between the CSTRs started, and the network relaxes to the final state. The calculations test the ability of the network to map initial patterns with a certain number of errors, relative to a stored pattern, onto that stored pattern. We use network sizes of 13-72 CSTRs. A network of 13 CSTRs represents a possible experimental implementation of the theory, and 72 CSTRs provide an example of the limiting behavior for large numbers of CSTRs. The variables under control of the experimentalist are the gain (λ), the number of distinct stored patterns, and the presented pattern. As presented patterns, we use patterns with a certain number of errors relative to a stored pattern. Stored patterns are generated randomly, and one of them is designated p′. A certain number of randomly chosen pixels is reversed (i.e., a CSTR which should be in the high iodide state is set to the low iodide state or vice versa) in the pattern p′ to generate the presented pattern with that number of errors. After initializing the iodide concentrations in the CSTRs to those in the presented pattern, the flow rates between the CSTRs are set to those dictated by eq 4. The network then evolves to a steady state, and this steady state is compared with the steady states of the stored patterns. We use 100 different sets of stored patterns and initial patterns to calculate statistics of recognition and error correction. A state is termed a steady state when the sum of the squares of the derivatives falls below a fixed tolerance; this sum is usually about 10^-15 M/s at the start.

In Figure 1, we show a typical temporal evolution of the solution of eqs 2, 3, and 4 for a network with 36 CSTRs and three stored patterns; the presented pattern has 10 errors relative to p′. The concentration of I^- is indicated by the shading and is shown as a function of time for each CSTR in the network. The total experimental time corresponding to the simulation is about 5 h. The recalled pattern has two errors relative to p′: CSTRs 10 and 32. The network corrects the error in CSTR 9 by causing it to go from the low to the high iodide state, and the network corrects errors in CSTRs 2, 16, 20, 22, 24, 27, and 34 by causing them to go from the high to the low iodide state. CSTRs 10 and 32 start and end in the low iodide state, but they should be in the high iodide state in p′. Although CSTRs 10 and 32 have iodide concentrations higher than the others in the low iodide state, their iodide concentration is much lower than that of the unstable state of an uncoupled CSTR, and thus they are in error.

To assess the ability of the network to recognize presented patterns, we classify the steady states of the network in one of three categories.
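To make the protocol concrete, the following self-contained sketch (Python with NumPy and SciPy; a reconstruction under assumed values of the flow rate and gain, not the authors' code) stores three random patterns in a 36-CSTR network via eq 4, presents p′ with six flipped pixels, integrates eqs 2 and 3 to a late time, and counts the remaining errors.

    import numpy as np
    from scipy.integrate import solve_ivp

    # Single-CSTR kinetics (eq 2); k0 is an assumed flow rate inside the
    # bistable window, and lam is an assumed gain.
    kA, kB, X0, Y0, k0 = 0.21, 21000.0, 2.0e-5, 7.1e-4, 2.0e-3
    lam, N, P = 0.0015, 36, 3

    def f(x):
        return (kA + kB * x) * (X0 + Y0 - x) * x + k0 * (X0 - x)

    # Stable (outer) and unstable (middle) states of the isolated CSTR.
    roots = np.roots([-kB, kB*(X0+Y0) - kA, kA*(X0+Y0) - k0, k0*X0])
    x_lo, x_mid, x_hi = sorted(r.real for r in roots if abs(r.imag) < 1e-12)

    rng = np.random.default_rng(0)
    R = rng.integers(0, 2, size=(P, N))             # stored patterns; R[0] is p'
    K = np.maximum((2*R - 1).T @ (2*R - 1), 0) / N  # Hebbian couplings (eq 4)
    np.fill_diagonal(K, 0.0)

    init = np.where(R[0] == 1, x_hi, x_lo)          # p' as concentrations
    flip = rng.choice(N, size=6, replace=False)     # present p' with 6 errors
    init[flip] = np.where(R[0][flip] == 1, x_lo, x_hi)

    def rhs(t, x):                                  # eq 3
        return f(x) + lam * (K @ x - K.sum(axis=1) * x)

    sol = solve_ivp(rhs, (0.0, 2.0e4), init, method="LSODA",
                    rtol=1e-8, atol=1e-12)
    recalled = sol.y[:, -1] > x_mid                 # high/low via unstable state
    print("errors relative to p':", int(np.sum(recalled != (R[0] == 1))))

Whether all six errors are corrected depends on the gain λ and on the flow rate, mirroring the gain dependence seen in Figures 2-5.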
If the final state is not homogeneous, we find the stored pattern to which the final state is most related. The minimum number of CSTRs in the steady state of the network which must be reversed to convert the steady state into a stored pattern determines to which stored pattern the steady state is related. The first class contains the patterns related to p′; the second class contains the patterns related to other stored patterns. The third class is the homogeneous steady state, which indicates that the network failed to recognize the pattern.
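This three-way classification can be written compactly (a Python/NumPy sketch with our own function names; treating any fully uniform final pattern as the homogeneous state, and a pattern and its negative as the same memory, are assumptions of this sketch).

    import numpy as np

    def classify(final_bits, R, p_prime=0):
        """Class 1: closest to p'; class 2: closest to another stored
        pattern; class 3: homogeneous (nonrecognition). final_bits is a
        boolean array (True = high [I-]); R holds the stored patterns."""
        if final_bits.all() or not final_bits.any():
            return 3                      # all CSTRs alike: homogeneous
        # Minimum number of CSTRs to reverse to reach each stored pattern;
        # a pattern and its negative are treated as the same memory.
        dists = [min(int(np.sum(final_bits != (r == 1))),
                     int(np.sum(final_bits == (r == 1)))) for r in R]
        return 1 if int(np.argmin(dists)) == p_prime else 2

Applied to the final state of the preceding network sketch, classify(recalled, R) returns 1 whenever the relaxed network is closest to p′ or its negative.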
To examine the average number of patterns found by the network in each of these classes, we store three random patterns in the network of 72 CSTRs and initialize the network with a pattern having 2-34 errors relative to p′. (If an initial pattern has more than 36 errors relative to a stored pattern, it is attracted to the negative of the stored pattern.) In Figure 2, we plot the percentage of final patterns in class 1 against the number of initial errors, relative to p′, for various values of the gain, λ. For fewer than 10 initial errors, virtually all final states are in this class. As the number of initial errors increases beyond 10, fewer initial patterns result in the recall of patterns in class 1, and for large gains and 28 or more errors, few patterns are in class 1.

Figure 2. Percentage of final patterns which are not homogeneous and most closely related to p′ (class 1) against the number of errors in the initial pattern, relative to p′, for various values of the gain.

In Figure 3, we plot the percentage of final patterns which are found to be homogeneous (class 3). For large numbers of initial errors and for large gains, the homogeneous final state is the most likely final state, which implies that the network has not recognized these patterns.

Figure 3. Percentage of final patterns which are homogeneous (class 3) against the number of errors in the initial pattern, relative to p′, for various values of the gain.

In Figure 4, we plot the percentage of final patterns which are in class 2. For all but the smallest λ, the other patterns are rarely recalled.

Figure 4. Percentage of final patterns which are not homogeneous and most closely related to stored patterns other than p′ (class 2) against the number of errors in the initial pattern, relative to p′, for various values of the gain.

These three figures show that the basins of attraction of the stored patterns, although substantial, are small compared to that of homogeneity. If patterns are chosen randomly, then less than 5% have 25 or fewer errors relative to one of the stored patterns or their negatives. From Figure 3, most of the other 95%, with 25 or more errors, will result in homogeneous final states for reasonably large gains. Thus, the basins of attraction of the stored patterns are surrounded by large areas of unrecognizable patterns.

To show that the network is correcting errors in the initial patterns, we plot in Figure 5 the average number of errors in the recalled patterns in class 1 (Figure 2) for various values of the gain. For the lowest gain, λ = 0.00051, the initial and final numbers of errors are almost identical: the coupling is too weak to destabilize the initial patterns. For larger gains, the patterns are recalled almost perfectly when the number of initial errors is less than 14. For example, with λ = 0.0012, 99% of the patterns with 10 initial errors result in the recall of the first stored pattern (Figure 2), and of these recalled patterns the average number of errors is 0.17, an improvement of a factor of 50.

Figure 5. Average number of errors in the recalled patterns in class 1 (Figure 2) against the number of initial errors for various values of the gain.

In addition to these averages, we can also examine the distribution of errors. In Figure 6, we plot the number of final
patterns in class 1 against the number of errors in those patterns. For six initial errors, only two of the 99 recalled patterns in class 1 have errors: one pattern with one error and one pattern with two errors are recalled. For 14 initial errors, 94 final patterns are in class 1 and 84 are perfect recalls. Most others have one to four errors, but one recalled pattern has 12 errors. For 22 initial errors, 55 final patterns are in class 1 and 44 are perfect recalls. No recalled patterns have one or two errors, but there is a fairly even distribution in the number of recalled patterns with 3-14 errors. Very rarely do we find a case where a final pattern has more errors than the initial pattern. The network recalls perfectly in most cases if a pattern is recognized.

Figure 6. Number of recalled patterns in class 1 against the number of errors, relative to p′, in those patterns for initial patterns with 6, 14, and 22 errors and λ = 0.0015.

Smaller networks also possess pattern recognition abilities, albeit to a lesser extent than larger networks. As an example of what is experimentally feasible, we cite the work of Laplante and Erneux.22 In a recent article, they discuss a coupled network of 16 bistable CSTRs with the possibility of 32 connections. With these capabilities in mind, we search for a network which may be implemented with this hardware. From eq 4, a network with two stored patterns has only one-quarter of the possible connections nonzero. Thus, a network of 13 CSTRs with two stored patterns has an average of 13 × 12/4 = 39 connections. If we restrict the stored patterns to having no more than seven CSTRs in one state, then the network has on the average about 33 connections, and from these we use only sets of stored patterns that require no more than 32 connections. The results are summarized in Table II. For all but the smallest λ, patterns in class 1 are always recalled with very few errors relative to p′ if there is only one error in the initial pattern. For two initial errors, patterns in class 1 are recalled 94% of the time for the larger λ, with an average of about 0.35 errors, an improvement of a factor of 5. These last simulations indicate that experiments could be performed to verify the theory presented here in its simplest case: a small network with two stored patterns. A network with two stored patterns is considerably less complex than a network with three stored patterns: only CSTRs which are in the same state in both stored patterns are connected, which occurs only 25% of the time. In a network with three stored patterns, CSTRs which are in the same state in two or more stored patterns are connected, which occurs 50% of the time. Thus, in a network with three stored patterns, two CSTRs which are in opposite states in one of the stored patterns may be connected; this does not occur in a network with two stored patterns.
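The connection counts in the preceding paragraph are easy to check numerically (a Python/NumPy sketch of our own; connections are counted as ordered pairs, i.e., as pump channels, which matches the 13 × 12/4 = 39 figure): averaging over random pattern pairs gives about 39 nonzero couplings unrestricted and about 33 when each pattern has exactly seven CSTRs in one state.

    import numpy as np

    rng = np.random.default_rng(1)
    N, trials = 13, 10000

    def n_connections(R):
        """Ordered pairs (pump channels) with nonzero k_ij under eq 4."""
        S = (2 * R - 1).T @ (2 * R - 1)
        np.fill_diagonal(S, 0)
        return int(np.sum(S > 0))

    def balanced_pattern():
        """Random pattern with exactly 7 CSTRs in the high state."""
        r = np.zeros(N, dtype=int)
        r[rng.choice(N, 7, replace=False)] = 1
        return r

    unrestricted = np.mean([n_connections(rng.integers(0, 2, (2, N)))
                            for _ in range(trials)])
    restricted = np.mean([n_connections(np.stack([balanced_pattern(),
                                                  balanced_pattern()]))
                          for _ in range(trials)])
    print(f"unrestricted: {unrestricted:.1f}, restricted: {restricted:.1f}")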
TABLE II: Results for a 13-CSTR Network with a Maximum of 32 Connections^a

    initial    homogeneous     second         first          final
    errors     patterns (%)    pattern (%)    pattern (%)    error

    λ = 0.00076
    1          0               0              100            0.59
    2          0               0              100            1.34
    3          0               5              95             2.30
    4          0               22             78             3.29
    5          0               43             57             4.45

    λ = 0.0023
    1          0               0              100            0.04
    2          0               6              94             0.37
    3          0               21             79             0.84
    4          3               41             56             1.79
    5          8               42             50             3.36

    λ = 0.0038
    1          0               0              100            0.06
    2          0               6              94             0.34
    3          1               20             79             0.84
    4          6               39             55             1.66
    5          11              40             49             3.25

^a The two stored patterns have 7 CSTRs in one state and 6 CSTRs in the other state.

4. Discussion

In the chemical network presented here, each individual unit has the simple property of bistability. The connections between the CSTRs may be simple diffusion. The network itself also
appears simple; it only has multiple steady states. Nonetheless, computational properties emerge due to our ability to program these steady states using the Hebbian rule. The use of the Hebbian rule for pattern storage is suggested by the Hopfield neural network theory. Our chemical network has many similarities with, but also differences from, the Hopfield networks; these are summarized in Table III.

TABLE III: Comparison of Hopfield and Chemical Networks^a

    Hopfield network                        chemical network

    Neurons
    electrical amplifiers with regular      CSTRs
      and inverted output
    monostable                              bistable

    Connections
    wires                                   mass transfer

    Connection Strength
    resistors                               pumping rates
    strengths have any sign                 strengths ≥ 0

    Storage Rule
    Hebbian                                 Hebbian
    the pattern and its negative            the pattern and its negative
      are stored                              are stored

    Steady States
    patterns are steady states              patterns are steady states
    homogeneous states are infrequent       homogeneous states are frequent

^a For the electrical implementation of a Hopfield network, see ref 10.

From the experimental viewpoint, neural networks and this chemical network have desirable properties. They operate well
under adverse conditions, such as in the presence of imperfections in the individual neurons or connections. Simulations of "diluted" networks indicate that some of the connections may be severed without a catastrophic loss of computational power. Networks of this type also function well in the presence of noise; indeed, a small level of noise often improves the performance. The performance of the chemical network does not depend strongly on the model parameters, such as the width of the hysteresis loop. Likewise, we have not attempted to optimize the network parameters to any great extent.

The components of our chemical neural network exist in biological networks: there are many biochemical reactions that are bistable, and bistability has been found in the response of neurons.12 Mass transfer among biological compartments is ubiquitous. These are the necessary components for building this network, and they are at least available in living systems.
Acknowledgment. This work was initiated while A.H. held an Alexander von Humboldt fellowship with Prof. F. W. Schneider at the University of Würzburg (Germany). This work was supported in part by the National Science Foundation.

References and Notes

(1) Hertz, J.; Krogh, A.; Palmer, R. Introduction to the Theory of Neural Computation; Addison-Wesley: Reading, MA, 1991.
(2) Hopfield, J. J. Proc. Natl. Acad. Sci. U.S.A. 1982, 79, 2554.
(3) Palmer, R. In Lectures in the Sciences of Complexity 1; Addison-Wesley: Redwood City, CA, 1989.
(4) Rumelhart, D.; McClelland, J. Parallel Distributed Processing, Vols. 1 and 2; MIT Press: Cambridge, MA, 1986.
(5) Hjelmfelt, A.; Weinberger, E. D.; Ross, J. Proc. Natl. Acad. Sci. U.S.A. 1991, 88, 10983.
(6) Hjelmfelt, A.; Weinberger, E. D.; Ross, J. Proc. Natl. Acad. Sci. U.S.A. 1992, 89, 383.
(7) Hjelmfelt, A.; Ross, J. Proc. Natl. Acad. Sci. U.S.A. 1992, 89, 388.
(8) Hjelmfelt, A.; Schneider, F. W.; Ross, J. Science 1993, 260, 335.
(9) McCulloch, W. S.; Pitts, W. Bull. Math. Biophys. 1943, 5, 115.
(10) Hopfield, J. J. Proc. Natl. Acad. Sci. U.S.A. 1984, 81, 3088.
(11) Tank, D.; Hopfield, J. J. Sci. Am. 1987, 257 (Dec), 62.
(12) Wang, L.; Ross, J. Proc. Natl. Acad. Sci. U.S.A. 1990, 87, 988.
(13) Sompolinsky, H. Phys. Today 1988, 41 (Dec), 70.
(14) Wang, L.; Ross, J. Phys. Rev. A 1991, 44, R2259.
(15) Minsky, M. Computation: Finite and Infinite Machines; Prentice-Hall: Englewood Cliffs, NJ, 1967.
(16) Hopfield, J. J.; Tank, D. Science 1986, 233, 625.
(17) Little, W. Math. Biosci. 1974, 19, 101.
(18) De Kepper, P.; Epstein, I. R.; Kustin, K. J. Am. Chem. Soc. 1981, 103, 6121.
(19) Ganapathisubramanian, N.; Showalter, K. J. Chem. Phys. 1986, 84, 5427.
(20) Pifer, T.; Ganapathisubramanian, N.; Showalter, K. J. Chem. Phys. 1985, 83, 1101.
(21) Hunt, K. L. C.; Kottalam, J.; Hatlee, M. D.; Ross, J. J. Chem. Phys. 1992, 96, 7019.
(22) Laplante, J.-P.; Erneux, T. J. Phys. Chem. 1992, 96, 4931.
(23) Stuchl, I.; Marek, M. J. Chem. Phys. 1982, 77, 1607.
(24) Stuchl, I.; Marek, M. J. Chem. Phys. 1982, 77, 2956.
(25) Kosek, J.; Marek, M. J. Phys. Chem. 1993, 97, 120.
(26) Crowley, M. F.; Field, R. J. J. Phys. Chem. 1986, 90, 1907.
(27) Dolnik, M.; Epstein, I. R. J. Chem. Phys. 1993, 98, 1149.
(28) Alamgir, M.; Epstein, I. R. J. Am. Chem. Soc. 1983, 105, 2500.
(29) Hebb, D. The Organization of Behavior; Wiley: New York, 1949.