
J. Phys. Chem. B 2010, 114, 16184–16188

Entropy, Information, and the Arrow of Time†

Irwin Oppenheim*

Chemistry Department, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States

Received: July 22, 2010; Revised Manuscript Received: September 9, 2010

† Part of the “Robert A. Alberty Festschrift”. * To whom correspondence should be addressed.

We shall investigate the relationships between the thermodynamic entropy and information theory and the implications that can be drawn for the arrow of time. This demands a careful study of classical thermodynamics and a review of its fundamental concepts. The statistical mechanical properties of time-dependent systems will be carefully studied, and the point at which the arrow of time appears will be described.

1. Introduction

It is frequently true that, when scientists of different disciplines have common interests, new and significant insights are obtained. An important example is the theory of transport properties in fluids, which had been treated by hydrodynamicists using macroscopic equations1 and by statistical mechanicians using molecular techniques.2 The resulting interactions between these groups led to mode coupling treatments by the molecular theorists that were confirmed by the computer simulations of Alder and Wainwright.3 The necessity for mode coupling led to new treatments of standard problems and to important advances in the understanding of static and dynamic critical behavior,4 long time tails,5 the nonanalytic density dependence of transport coefficients,6 and the ability to treat the dynamical behavior of particle (sand) flow7 and polymers.8

It is also true that workers in different fields can be misled by the fact that fundamental equations in one field appear to be similar to those in another field. This can lead to faulty conclusions and assumptions, and it has occurred in the fields of information theory, on the one hand, and statistical mechanics and thermodynamics, on the other. The confusion arises because the expressions for information and entropy look the same. These expressions are given by9

$$S = -k\,\mathrm{Tr}\!\left(\rho^{(N)} \ln\!\left[\rho^{(N)} N!\, h^{3N}\right]\right) \qquad (1)$$

where S is the entropy, k is Boltzmann’s constant, and, in classical systems, ρ^(N) is the total distribution function of the system. The trace operation includes an integration over all of the positions and momenta of all particles in the system and, for the grand canonical ensemble appropriate to open systems, a sum over the total number of particles. This expression is the Gibbs entropy postulate and is the most general exact expression for the entropy in equilibrium systems. In quantum mechanics, ρ^(N) is the density matrix for the total system, and the factors N!h^{3N} do not appear.10 The Shannon expression for information looks the same as this expression, namely11

$$I = -K \sum_i P_i \ln P_i \qquad (2)$$

where P_i is the probability of the occurrence of state i. However, these expressions have very different physical content, and conclusions drawn about the properties of entropy from information theory ideas can be misleading and frequently incorrect. We shall focus our attention on this situation.
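The formal identity of eqs 1 and 2 is easy to exhibit numerically. The toy computation below is our illustrative sketch, not part of the original paper: it evaluates eq 2 for a biased coin and the discrete analogue of eq 1 for a hypothetical two-level system whose probabilities are fixed by the canonical ensemble. The energy gap and temperature are arbitrary choices.

```python
import numpy as np

kB = 1.380649e-23  # Boltzmann's constant, J/K

def shannon_information(p, K=1.0):
    """Eq 2: I = -K sum_i P_i ln P_i for any discrete distribution."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                       # convention: 0 ln 0 = 0
    return -K * np.sum(p * np.log(p))

def gibbs_entropy_two_level(delta_e, T):
    """Discrete analogue of eq 1 for a hypothetical two-level system:
    the probabilities come from the equilibrium (canonical) ensemble,
    not from a message source."""
    p = np.exp(-np.array([0.0, delta_e]) / (kB * T))
    p /= p.sum()
    return -kB * np.sum(p * np.log(p)), p

# Same functional form, entirely different physical content:
print("I for a biased coin:", shannon_information([0.9, 0.1]))

S, p = gibbs_entropy_two_level(delta_e=4.0e-21, T=300.0)
print("equilibrium probabilities:", p)
print("S in J/K:", S)
```

The point of the comparison is that the formula itself carries no thermodynamics: the physical content of eq 1 enters through how ρ^(N) is fixed, by macroscopic constraints and the equilibrium ensemble, while the P_i of eq 2 may describe anything whatsoever.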

Unfortunately, the word “entropy” has become fashionable and has been used in misleading ways in a wide variety of disciplines,12 including the social sciences, economics, chaos theory, and astrophysics. Discussions of the entropy of the universe or of black holes are completely unjustified, since these systems are not in equilibrium, nor can they be brought into equilibrium by adding constraints. One of the few constructive uses of the term was by the economist Paul Samuelson,13 who derived valuable economic insights from it. It should be noted that Samuelson was a student of Edwin Bidwell Wilson, who was a student of J. Willard Gibbs.

In this paper we shall discuss in some detail the properties of entropy in thermodynamics and statistical mechanics and the subtleties associated with that concept. We will then contrast those properties with the assumptions used in information theory.

2. Thermodynamics, Measurement Theory, and Equilibrium

The salient property of thermodynamics is that it is a macroscopic discipline and is not concerned with the properties of individual atoms or molecules.14 Thermodynamic properties such as energy density, number density, pressure, and temperature are measured by macroscopic measuring devices that almost always destroy phase relationships between molecular properties. Macroscopic measurements have been described in detail in many papers by van Kampen.15 Quantum mechanics is really not involved in these measurements. The thermodynamic description applies to macroscopic systems only.

An essential concept in the discussion of the entropy is thermodynamic equilibrium. An equilibrium state has time-independent properties and no fluxes, either within the system or between the system and its surroundings. In addition, the state has to be approachable via a wide variety of paths. Thus, for example, a glass is not an equilibrium state because, although it can be approached from high temperatures, it cannot be approached from lower temperature equilibrium states.

Equilibrium states are always defined by the constraints placed on the system. By changing the constraints, we can proceed from one equilibrium state to another. Again, these constraints have to be macroscopic and cannot restrict the behavior of specific individual molecules.

As an example of how a change in constraints can change the properties of the system, we consider the following. In one-phase systems, in the absence of external fields, the properties are uniform throughout; the constraint is that the system is isolated. In two-phase systems, the properties are uniform within each phase, and in equilibrium the temperatures, pressures, and chemical potentials of the two phases are equal. If the constraints are changed from the isolation of the overall system to the isolation of each phase, the temperatures, pressures, and chemical potentials of the two phases may be different in equilibrium.

A criterion for equilibrium is

$$(\delta S)_E \leq 0 \qquad (3)$$

where S is the entropy, E is the internal energy, and δ signifies all possible infinitesimal changes consistent with the constraints imposed on the system. Therefore, the entropy is a maximum at equilibrium.

Entropy is not measured directly but is inferred from other properties of the system. Along a reversible path,

$$dS = \frac{dQ_{\mathrm{rev}}}{T} \qquad (4)$$

where dQ_rev is the heat absorbed along the path; for an irreversible path,

$$dS > \frac{dQ_{\mathrm{irrev}}}{T} \qquad (5)$$
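A worked example may help here; it is our sketch, not part of the original paper. The free expansion of an ideal gas into vacuum is irreversible and absorbs no heat, yet the entropy change between its equilibrium end states follows from eq 4 by integrating dQ_rev/T along a reversible isothermal path connecting the same two states. The gas amount, temperature, and volumes below are arbitrary choices.

```python
import numpy as np

R = 8.314462618        # gas constant, J/(mol K)
n, T = 1.0, 300.0      # 1 mol of ideal gas, isothermal at 300 K
V1, V2 = 1.0, 2.0      # initial and final volumes, m^3

# Reversible isothermal path: dQ_rev = p dV = (nRT/V) dV.
V = np.linspace(V1, V2, 100001)
dQ = n * R * T / V
dS = np.sum(0.5 * (dQ[1:] + dQ[:-1]) * np.diff(V)) / T   # trapezoidal rule for eq 4

print("eq 4 integrated:       dS =", dS, "J/K")
print("analytic nR ln(V2/V1):    ", n * R * np.log(V2 / V1), "J/K")
# The actual free expansion absorbs no heat (dQ_irrev = 0), so
# dS > dQ_irrev/T = 0, consistent with the inequality in eq 5.
```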

We remember that S is an extensive function of state, and the change of entropy between any two equilibrium states can be determined from thermodynamics no matter what the process is. If the process is irreversible and the states are not equilibrium states, we cannot determine the entropy from thermodynamic measurements. Unfortunately, glasses fall into this category, and the assignment of a thermodynamic entropy to a glass is not justified.

To emphasize these points, we recall the following. An equilibrium state is time-independent, has no currents, and is attainable from a variety of directions. Thus, for example, glasses are not equilibrium states. Equilibrium states are determined by external constraints that, again, are macroscopic. Macroscopic properties are measured by macroscopic measuring devices, and subtleties of phases are smoothed over. The equilibrium states mentioned are stable equilibrium states or, perhaps at special points, neutral. Unstable equilibria do not exist because of fluctuations.

The mathematical statements of the second law of thermodynamics are obtained from the analysis of the physical statements promulgated by Kelvin (the uncompensated conversion of heat into work does not occur in natural processes) and by Clausius (all natural processes are irreversible). Here is where the apparent conflict between the mechanical equations of motion, which are reversible, and the second law occurs: it is an apparent conflict between the microscopic equations of motion and the equations describing the time dependence of the macroscopic properties of the system.

The conclusions of thermodynamics apply to macroscopic systems only. A system with small numbers of particles will not obey the laws of thermodynamics, especially the second law. Also, the measurements must be macroscopic. Consider, for example, a two-component solid mixture. In thermodynamics, we deal with the gross properties of the system and not with the details of how the particles are placed on the lattice. We describe ordered and disordered systems but do not deal with a particular disordered system. It is not possible to talk about the entropy of a specified random array of particles; there is no way of computing the entropy of such a system. We can, of course, compute the entropy of a macroscopically disordered system in which all possible arrangements of the particles are considered.

It is essential to distinguish between the processes that lead the system from one state to another. These states may be equilibrium states (E) or nonequilibrium states (N). The processes may be reversible (R) or irreversible (I), with special mention of quasi-static (q) processes. A reversible process is quasi-static and proceeds through a path of equilibrium states. The system and its surroundings can be brought back to their initial states along a path infinitesimally different from the forward path. The reversible process starts from one equilibrium state, proceeds through a series of equilibrium states, and ends in an equilibrium state:

$$E \underset{\mathrm{R}}{\rightleftharpoons} E' \qquad (6)$$

It is postulated in thermodynamics that any equilibrium state of a system can reach any other equilibrium state by a reversible process. The difference in entropy between E and E′ can be determined either by using eq 4 or by using the fact that the entropy is a state variable, so that differences between the entropies of E and E′ are independent of the process. An equilibrium state can go to a nonequilibrium state, or vice versa, by an irreversible process. Thus,

$$E \xrightarrow{\mathrm{I}} N, \qquad N \xrightarrow{\mathrm{I}} E, \qquad \text{and} \qquad N \xrightarrow{\mathrm{I}} N_1 \qquad (7)$$

Here the entropy difference cannot be determined either from the heat absorbed in the process or by using the fact that S is a state variable.

The care that must be taken in determining whether a particular state is an equilibrium state is illustrated by the Gibbs ink blot experiment16 and the spin echo experiments of Hahn.17 In the ink blot experiment, a blot of ink is perturbed by a stirring device that spreads it slowly over a fluid. One would think it obvious that the entropy of the extended drop is larger than that of the original drop. But if the stirring motion is reversed after a short time, the system will return to its initial state except for some blurring due to diffusion. Aside from this blurring, no irreversibility is involved, since the intermediate state retains the memory of the initial state except for the diffusive process involved.
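A minimal numerical analogue of the reversed stirring, assuming nothing more than free streaming in a periodic box (our sketch with arbitrary parameters, not Gibbs’s construction): particles that begin in a small “blot” spread until they look uniformly mixed, yet reversing every velocity returns them to the blot. The commented-out random kick plays the role of diffusion and blurs the recovery.

```python
import numpy as np

rng = np.random.default_rng(0)
N, L, steps, dt = 1000, 1.0, 2000, 1e-3

x0 = 0.05 * rng.random(N)            # the "ink blot": a small region
v = rng.normal(0.0, 1.0, N)          # random velocities

x = (x0 + v * steps * dt) % L        # "stirring": free streaming
print("spread after stirring:", x.std())   # looks uniformly mixed

v = -v                               # reverse all of the motion
# v += 1e-2 * rng.normal(size=N)     # uncomment: a diffusive blur
x = (x + v * steps * dt) % L         # stream for the same time again

print("max recovery error:", np.abs(x - x0).max())  # ~ machine precision
```

The spread-out state only looks equilibrated: the correlations that store the initial condition are invisible to a macroscopic measurement, but they remain fully present in the dynamics.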

In the spin echo system, the initial state of the spins is oriented along an external field. When the spins are subjected to a spin flip (90°), they go to a state of zero net spin that appears to be an equilibrium state in the absence of the external field. However, if the spins are flipped again (180°), they return essentially to the initial state, except for a small seeping of information into the lattice. Again we return to the initial state because the pseudoequilibrium state is produced by inhomogeneities in the system, and that process is reversed by the spin flip.

In both cases, we appear to go to intermediate states that are substantially different from the initial state. However, these intermediate states retain memory of the initial state and are certainly not equilibrium states. For small spin systems, detailed calculations have been performed by Waugh.18 The differences between large and small systems are described, and the so-called equilibrium systems are investigated.

Our conclusion so far is that we can obtain expressions and values for entropy in equilibrium systems only. Perhaps it is possible to expand the situations in which the entropy can be determined. The other conclusions are that the concept of entropy is not useful for small systems and that mistakes can be made in ascribing values of entropy to pseudoequilibrium systems.

Molecular measurements on systems do not lead unambiguously to entropy calculations. The smoothing of data by the macroscopic measuring devices does not allow us to specify the molecular states of the system and does not allow us to retrace our steps on a molecular basis. While we can proceed from a final equilibrium state to an initial equilibrium state, we cannot recover the detailed molecular properties of the initial state. This does not imply that the tendency for a system to proceed from a nonequilibrium state to an equilibrium state depends on the observer. This tendency is a property of the system, but it can be affected by observations on the system.

3. Statistical Mechanics

In the last section, we implied that the calculation of the entropy of systems is reliable only for macroscopic equilibrium systems. In this section, we will discuss additional insights obtained from molecular statistical mechanics. There are several molecular expressions for the entropy, all of which have some drawbacks. The most general of these expressions is the so-called Gibbs entropy postulate, which is given by

$$S = -k\,\mathrm{Tr}\!\left(\rho^{(N)} \ln\!\left[\rho^{(N)} N!\, h^{3N}\right]\right) \qquad (8)$$

In classical mechanics, ρ^(N) is the distribution function of the N particle system, and the trace operation implies an integration over the phase space of the N particle system and a sum over N (for grand canonical ensembles). In quantum mechanics, the expression becomes

$$S = -k\,\mathrm{Tr}\!\left(\rho^{(N)} \ln \rho^{(N)}\right) \qquad (9)$$

where ρ^(N) is the density matrix for the N particle system, and the sum over N is included when appropriate. All of the usual formulas of equilibrium thermodynamics can be obtained from these expressions. While these expressions are valid for equilibrium systems, they are not valid for nonequilibrium systems. This is because the time derivatives of S(t) from eqs 8 and 9 for a classical or quantum system are zero; that is,

$$\dot{S}(t) = 0 \qquad (10)$$

using the classical or quantum dynamic equations.

Jaynes and his colleagues19 have successfully used information theory to obtain explicit forms for the distribution function in terms of thermodynamic quantities. They attempted to extend these techniques to nonequilibrium systems but were unsuccessful.
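Returning to eq 10: its quantum version can be checked directly in a few lines. The sketch below is ours, not a calculation from the paper, and uses an arbitrary random Hamiltonian and an arbitrary random mixed state: the exact dynamics ρ(t) = U ρ U† preserve the eigenvalues of ρ, so the entropy of eq 9 cannot change.

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
d = 6                                   # arbitrary Hilbert-space dimension

A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
H = (A + A.conj().T) / 2                # random Hermitian Hamiltonian

B = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
rho = B @ B.conj().T                    # random mixed (nonequilibrium) state
rho /= np.trace(rho).real

def entropy(rho, k=1.0):
    """Eq 9, S = -k Tr(rho ln rho), evaluated via the eigenvalues of rho."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-15]
    return -k * np.sum(w * np.log(w))

for t in (0.0, 1.0, 10.0, 100.0):
    U = expm(-1j * H * t)               # exact unitary evolution
    rho_t = U @ rho @ U.conj().T
    print(f"t = {t:6.1f}   S = {entropy(rho_t):.12f}")
# S is constant to machine precision: the exact dynamics alone
# produce no arrow of time.
```

In classical mechanics, the same conclusion follows from the Liouville theorem: the flow preserves phase-space volume, and with it the value of eq 8.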

At this stage of the game, there is no conflict between the reversible molecular dynamical equations and the expression for entropy, since no irreversible processes are considered. The “exact” expression for entropy, eq 8, is completely consistent with the molecular equations but not with the second law of thermodynamics. We have to be able to generalize eq 8 to include nonequilibrium systems and irreversible processes. This will necessitate the introduction of approximations and will rationalize the existence of irreversible processes.

The exact equations of motion for the distribution functions or for the densities of macroscopic properties all have the property of being time reversal invariant, just like the dynamical equations derived from Newton’s laws. At this point, there is no conflict between these equations and no apparent arrow of time. These equations can be cast in generalized Langevin and Fokker-Planck forms20,21 and are particularly useful when the quantities involved contain small numbers of particles. The dynamical variables are usually sums of one- and two-particle terms, and the reduced distribution functions ρ^(q/N), for q ≪ N, depend explicitly on small numbers of particles.

When the Langevin type equations are approximated using the orders of magnitude of the terms and the differences in time scales, the results are no longer time reversible, and there is an apparent arrow of time. The same result is also obtained from computer simulations, since the calculations are never exact. These approximations are valid only for large systems.22 It is not chaos or ergodic behavior that introduces the arrow of time but the fact that the system eigenvalues go from a discrete to an almost continuous set as the number of particles increases. The authors of the book by Vulpiani et al. have made a very careful study of which phenomena lead to dissipative behavior. An explicit calculation for a special oscillator linearly interacting with a large number of oscillators is easily performed: as the number of bath oscillators increases, there is a transition from oscillating to decaying behavior for the special oscillator23 (a numerical sketch of this transition is given at the end of this section). It is because of the smoothing introduced by the measuring devices that the correction terms are eliminated. Thus, the analytic results, the computer simulations, and macroscopic measurements are consistent with each other.

It is possible to be more explicit about the time dependences of reduced distribution functions and dynamical variables. If an isolated system starts in a nonequilibrium state, neither its distribution functions nor its dynamical variables relax to their equilibrium forms. In principle, we would expect that the N particle distribution functions always retain the memory of their initial forms. In fact, one can show that24

$$\rho^{(N)}(t) \xrightarrow{t \to \infty} \rho^{(N)}_{\mathrm{eq}}\left[1 + O(1)\right] \qquad (11)$$

whereas

$$\rho^{(q/N)}(t) \xrightarrow{t \to \infty} \rho^{(q/N)}_{\mathrm{eq}}\left[1 + O(q/N)\right] \qquad (12)$$


where ρ^(q/N) is the reduced q particle distribution function in a system containing N particles.

Van Kampen’s objection25 to linear response theory is correct for total distribution functions but not for reduced distribution functions. The basis of his objection is that any external perturbation of the N particle system changes the trajectories of all of the particles, and no linear theory can describe this effect except for very short times. However, when the phase space of most of the particles is integrated over, we are looking at the average effect of the perturbation, and the perturbation can be described linearly. We should mention that special initial conditions lead to situations in which there is no apparent decay of the reduced quantities in the system.26

There is nothing esoteric in the macroscopic time dependences that thermodynamics predicts. It is not necessary to invoke extremely strange arguments to justify the second law.27 The macroscopic nature of the measurement process and the macroscopic size of the system suffice.

There are at least two situations that lead to valid approximations to eq 8. One of these is a low-density gas, for which Boltzmann replaced eq 8 by

$$S = -k\,\mathrm{Tr}\!\left(\rho^{(1/N)} \ln \rho^{(1/N)}\right) \qquad (13)$$

where ρ^(1/N) is the singlet distribution function in a system of N particles. This quantity does have the property that

$$\dot{S}(t) \geq 0 \qquad (14)$$

where the equality holds for equilibrium systems and the inequality applies to irreversible processes from the initial nonequilibrium state to the final equilibrium state. Unfortunately, there is no way to derive this result rigorously, even for low densities.

Another, more general situation occurs when the time and space dependences of the macroscopic densities (number density, momentum density, and energy density) are small, in the sense that they change slowly in time compared to molecular times and vary slowly in space compared to molecular dimensions. In this case, the initial state of the system is approximately in local equilibrium and can be considered to be in equilibrium when the constraints applied do not allow fluxes of number, momentum, or energy between finite, but small, regions of space. We then assume that the local entropy density, s(r, t), has the same dependence on the local number density, n(r, t), energy density, e(r, t), and momentum density, p(r, t), as it does in equilibrium.28 This is clearly an approximation, and the equations for the time derivatives of n, e, and p are also approximated. These approximations lead to equations compatible with the second law and with classical hydrodynamics, including mode-coupling modifications.29
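As promised above, the oscillator calculation is simple enough to sketch numerically. The construction below is our own illustration of the effect described in the text, not the calculation of ref 23, and the frequencies and couplings are arbitrary choices: a tagged oscillator is coupled bilinearly to n bath oscillators, the exact normal modes of the quadratic Hamiltonian are found, and the tagged coordinate is evolved from a displaced initial condition.

```python
import numpy as np

def tagged_oscillator(n, w0=1.0, cbar=0.2):
    """Tagged oscillator of frequency w0 bilinearly coupled to n bath
    oscillators (unit masses): exact solution via normal modes."""
    wb = np.linspace(0.5, 1.5, n)          # bath frequencies
    c = np.full(n, cbar / np.sqrt(n))      # couplings, ~n^(-1/2) scaling

    K = np.diag(np.concatenate(([w0**2], wb**2)))
    K[0, 1:] = K[1:, 0] = c                # bilinear coupling q0*qj
    lam, V = np.linalg.eigh(K)             # normal modes
    assert lam.min() > 0                   # the coupled model stays stable

    t = np.linspace(0.0, 200.0, 4001)
    a = V[0]                               # projection of q0(0) = 1 on modes
    q0 = (V[0] * a) @ np.cos(np.sqrt(lam)[:, None] * t)
    return t, q0

for n in (2, 10, 50, 500):
    t, q0 = tagged_oscillator(n)
    print(f"n = {n:4d}   max |q0(t)| for t > 100: "
          f"{np.abs(q0[t > 100]).max():.3f}")
```

For small n the tagged coordinate keeps returning to near its initial amplitude (recurrences), while for large n the recurrence time grows beyond any observation window and the motion is, for every practical purpose, irreversibly damped, with no modification of the reversible equations of motion anywhere.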

4. Glasses

We have mentioned before that it is difficult to imagine how the concept of entropy can be used to elucidate the properties of glasses. Much work in this field was stimulated by the Kauzmann paradox,30 in which it seems that, as T → 0, a disordered system has a lower entropy than the ordered equilibrium system. Kauzmann used an extrapolation of the heat capacity from the liquid, through the supercooled liquid, to the glass. However, since the supercooled liquid is not in equilibrium, it is difficult to justify the extrapolation, and there is no reason to think that the entropy he computes has any validity or leads to a paradox.

Another attempt to rationalize the use of entropy to describe the properties of a glass is due to Kivelson and Reiss,31 among others. They assume that, since the particles in a glass are more or less fixed in position, the entropy at the glass transition has a finite drop because the translational motion is frozen. This is a dangerous hypothesis, as is discussed experimentally by Johari,32 who has also studied the Kauzmann paradox. His conclusions are echoed by Goldstein33 from the theoretical point of view. One way of justifying this idea would be to introduce constraints that fix the position of each particle. This is inconsistent with the fact that the constraints have to be macroscopic in nature and cannot be applied to individual molecules.

5. Conclusion

For many years, a number of esoteric discussions about the arrow of time circulated among statistical mechanicians, ergodic theorists, chaos theorists, and information theorists. The Brussels school was active in these discussions, which led to the application of superoperators in quantum mechanics and statistical mechanics.34 There were hints that the fundamental dynamical equations in classical and quantum mechanics had to be modified to reproduce experimental results. There were also investigators who claimed that the interaction between the system of interest and its surroundings was responsible, because of the flow of information between them.35 Few people emphasized the essential contributions to the arrow of time, namely, the large number of particles in the system and the fact that thermodynamic measurements are made on a macroscopic scale. Among these are van Kampen15 and Vulpiani and colleagues,22 who start in very different places but end with the same conclusions.

References and Notes

(1) Lorentz, H. A. Theory of Electrons; Teubner: Germany, 1909.
(2) Martin, P.; Kadanoff, L. Phys. Rev. 1961, 124, 670.
(3) Alder, B. J.; Wainwright, T. E. J. Chem. Phys. 1958, 31, 459.
(4) Kawasaki, K. Ann. Phys. 1970, 61, 1.
(5) Schofield, J.; Lim, R.; Oppenheim, I. Physica A 1992, 181, 89.
(6) Kawasaki, K.; Oppenheim, I. Phys. Rev. 1965, 139, A1763.
(7) Schofield, J.; Oppenheim, I. Physica A 1993, 196, 209.
(8) Shea, J.-E.; Oppenheim, I. Physica A 1998, 250, 265.
(9) Gibbs, J. W. The Collected Works of J. Willard Gibbs, Vol. 1; Longmans, Green and Co.: New York, 1928.
(10) von Neumann, J. Mathematische Grundlagen der Quantenmechanik; Springer-Verlag: Berlin, 1932.
(11) Shannon, C. E.; Weaver, W. The Mathematical Theory of Communication; University of Illinois Press: Urbana, IL, 1949.
(12) Greven, A.; Keller, G.; Warnecke, G., Eds. Entropy; Princeton University Press: Princeton, NJ, 2003. Schulman, L. S. Time’s Arrow and Quantum Measurement; Cambridge University Press: Cambridge, U.K., 1997.
(13) Samuelson, P. Foundations of Economic Analysis; Harvard University Press: Cambridge, MA, 1947.
(14) Beattie, J. A.; Oppenheim, I. Principles of Thermodynamics; Elsevier: Amsterdam, 1979.
(15) van Kampen, N. G. Physica A 1988, 153, 97.
(16) Gibbs, J. W. The Collected Works of J. Willard Gibbs, Vol. 2; Longmans, Green and Co.: New York, 1928.
(17) Hahn, E. Phys. Rev. 1950, 80, 580.
(18) Waugh, J. S. Mol. Phys. 1998, 95, 731.
(19) Jaynes, E. T.; Rosenkrantz, R. D., Eds. E. T. Jaynes: Papers on Probability, Statistics, and Statistical Physics; Reidel: The Netherlands, 1983.
(20) Zwanzig, R. J. Chem. Phys. 1960, 33, 1338.
(21) Mori, H. Prog. Theor. Phys. 1965, 33, 423.
(22) Castiglione, P.; Falcioni, M.; Lesne, A.; Vulpiani, A. Chaos and Coarse Graining in Statistical Mechanics; Cambridge University Press: New York, 2008.
(23) van Kampen, N. G. Dan. Mat. Fys. Medd. 1951, 26, 15. Ullersma, P. Physica 1966, 32, 56, 74.


(24) Ronis, D.; Oppenheim, I. Physica A 1977, 86, 475. Oppenheim, I. Prog. Theor. Phys. Suppl. 1990, 99, 364.
(25) van Kampen, N. G. Phys. Norv. 1971, 5, 3.
(26) Suarez, A.; Silbey, R.; Oppenheim, I. J. Chem. Phys. 1992, 97, 5101.
(27) Maccone, L. Phys. Rev. Lett. 2009, 103, 070401.
(28) de Groot, S. R.; Mazur, P. Non-Equilibrium Thermodynamics; North-Holland: Amsterdam, 1962.
(29) Oppenheim, I.; Levine, R. D. Physica A 1979, 99, 383. Machta, J.; Oppenheim, I. Physica A 1982, 112, 361. Kavassalis, T. A.; Oppenheim, I. Physica A 1988, 148, 521.

(30) Kauzmann, W. Chem. Rev. 1948, 43, 219.
(31) Kivelson, D.; Reiss, H. J. Phys. Chem. B 1999, 103, 8337.
(32) Johari, G. P. Thermochim. Acta 2010, 500, 111.
(33) Goldstein, M. J. Chem. Phys. 2008, 128, 154510.
(34) Prigogine, I. From Being to Becoming: Time and Complexity in the Physical Sciences; W. H. Freeman: San Francisco, 1980.
(35) Lebowitz, J. L.; Bergmann, P. G. Phys. Rev. 1955, 99, 578.
