Ind. Eng. Chem. Res. 2000, 39, 1713-1722
1713
Complex Domain Chemical Process Simulation in Theory and in Practice† Angelo Lucia‡ Department of Chemical Engineering, University of Rhode Island, Kingston, Rhode Island 02881-0805
This paper contains a review of chemical process simulation in the complex domain. The current theoretical framework and computational results for both steady-state and dynamical simulation are surveyed. In addition, issues related to process feasibility and infeasibility are discussed as are the advantages and disadvantages of complex domain process simulation. Numerical applications present in the literature are surveyed and developmental needs for the next generation of complex domain process simulators are suggested. 1. Introduction Equation-solving tools (i.e., methods for solving nonlinear algebraic equations and optimization problems as well as numerical integration techniques) are often referred to as the “engines” of process simulation and optimization software. With rapid advances in hardware, new approaches to software development and use (like open architectures) and the fact that current tools for equation-solving do not represent a fully mature technology, algorithms are still being developed and refined. In this paper I present review material to give the reader a perspective of the current state of the art and future directions of chemical process simulation in the complex domain. I open the paper with a short chronology of the development of complex domain process simulation and then present theoretical results, discuss advantages and disadvantages, present a number of results from a variety of examples, and finally close with a discussion of the future of complex domain process simulation. In 1989 my students and I began research in complex domain process simulation as an outgrowth of my interest in equation-solving, chaos, and problems with multiple solutions. If you look through some of my prior work, as well as the work of others, you will see reference to “roots or solutions disappearing” in studies of dew point calculations (see p 648 in Lucia et al.1) and azeotropic distillations involving heterogeneous mixtures. Of course, real-valued roots or solutions do not simply disappear; they bifurcate into the complex domain. In hindsight it seems rather surprising that modeling and simulating chemical processes in the complex domain had not been tried before. After all, those of us in process engineering have all taken and/ or taught courses that involve complex domain mathematics. We all learned to use the quadratic formula in high school, and we take elementary physics, introductory electrical engineering, and process control as undergraduates. Some of us even teach process control. Furthermore, complex mathematics has been around a very long time and complex iterative methods date back to Cayley2 and before. † With many thanks to Professor J. D. “Bob” Seader for his constant encouragement. ‡ Tel.: (401) 874-2814. Fax: (401) 874-4689. E-mail: lucia@ egr.uri.edu.
A Brief Chronology. We began research in complex domain process simulation with the paper by Lucia and Xu3 and the development of single-variable methods (i.e., Newton and trust region methods) for solving equations in the complex plane, which were applied to the Soave-Redlich-Kwong equation of state (EOS) and an adiabatic continuous-stirred tank reactor (CSTR). The significant contribution of this first paper is the important observation and conjecture that non-zerovalued local minima (i.e., singular points) on the real line are saddle points in the complex plane. This was followed by two papers on the development and use of multivariable equation-solving methodssone by Lucia et al.4 that contains the numerical framework for steady-state complex domain process simulation, the details of which can be found in the Ph.D. thesis of Guo,5 and the other by Lucia and Taylor6 on the application of complex iterative methods to isothermal, isobaric (TP) flash calculations. Lucia and Wang7 then studied process dynamics in the complex domain and presented numerical results for a retrograde flash problem. These details can be found in the M.S. thesis of Wang.8 Following this, Sridhar and Lucia9 provided rigorous proof that singular points are always saddle points in the complex domain. This result is extremely important because it establishes the existence of paths from singular points to stationary points of lower norm and a strong theoretical foundation for global convergence to a solution. Taylor et al.10 used a complex domain Newton method to solve a number of “difficult” distillation problems while Gow et al.11 applied complex domain trust region methods to phase equilibria involving refrigerant mixtures. In many of these applications, real domain methods fail to converge whereas with complex domain methods iterates start out as realvalued, enter the complex domain, and converge to either a real- or complex-valued solution. More recently, Lucia and Liu12,13 studied and continue to study the behavior of complex domain numerical methods in singular regions and have developed first- and secondorder acceleration methods for quickly locating singular points, thereby providing improvements in algorithm efficiency. Some of the details of these results can be found in the M.S. and Ph.D. theses of Liu.14,15 Many of the previously cited papers contain considerable numerical experimentation throughout the basins of attraction for the given applications.
10.1021/ie990655c CCC: $19.00 © 2000 American Chemical Society Published on Web 05/02/2000
1714
Ind. Eng. Chem. Res., Vol. 39, No. 6, 2000
2. Algorithmic and Theoretical Foundation Complex domain process simulation is applicable to both steady-state and unsteady-state processes because dynamical equations can always be transformed into algebraic form using difference relations. Therefore, consider process model equations written in the compact form
F(Z,p) ) 0
(1)
where F is a function defined on the complex domain, Cn, and where Z and p denote complex-valued vectors of unknown variables and parameters of length n and m, respectively. Equation 1 can be easily rewritten in the iterative fixed-point form
Zk+1 ) G(Zk,p)
(2)
where G is a nonlinear iterative map that depends on F and is therefore also defined on Cn and k is an iteration counter. It is well-known that solutions or roots to eq 1, say Z*, correspond to fixed points of G (i.e., Z such that Z* ) G(Z*,p)). For example, Newton’s method has the fixed-point form
Zk+1 ) Zk - [J(Zk,p)]-1 F(Zk,p) ) GN(Zk,p)
(3)
where J(Z,p) denotes the Jacobian matrix of F evaluated at Z and p and where the subscript “N” denotes Newton’s method. Thus, the Newton step is defined by
∆ZkN ) -[J(Zk,p)]-1 F(Zk,p)
(4)
Note that the roots, solutions, or fixed-points and the elements of the Jacobian matrix can be real- or complexvalued. Other equation-solving methods (e.g., direct substitution, tearing algorithms, and trust region methods) can also be written as nonlinear iterative maps, although the actual functionality of G is sometimes implicit. Because we prefer to use trust region methods to guarantee convergence to a stationary point of the least-squares function FTF, it is useful to define the Cauchy (or steepest descent or gradient) direction which is given by
∆ZkC ) -[J(Zk,p)]T F(Zk,p)
(5)
It is also helpful to have an overview of complex domain trust region algorithms before presenting theoretical material on singularity. 2.1. Complex Domain Trust Region Algorithms. Equation-solving methods such as Newton’s method can exhibit both periodic and chaotic behavior from real- and complex-valued starting points, even on simple nonlinear problems (see Cayley,2 Fatou,16 Julia,17 and Curry et al.18) and this periodic/aperiodic behavior is directly related to the presence of singular points (i.e., any point at which the Jacobian matrix of F is rank-deficient, has a zero determinant, or has equivalently one or more zero eigenvalues). If iterates behave periodically or chaotically, then so does the two-norm (least-squares function) or any other norm of F. Stabilization procedures such as line searching and trust region methods, which force norm reduction at each iteration, avoid periodic and chaotic behavior. However, they do so at the risk of terminating at singular points that are local minima (or saddles) in the two-norm of F in the real domain (Lucia
and Xu3) for which the rate of convergence is quite often slow (Q-linear). The complex domain trust region methods of Lucia and co-workers are based on the so-called “dogleg” strategy of Powell,17 which is globally convergent to stationary points (i.e., local and global minima) of the least-squares function of F in the real domain. Local minima or saddle points in the real domain correspond to singular points of F. This guaranteed convergence to either a solution or a singular point of F is achieved by using a linear combination of Newton and steepest descent steps, norm reduction in F at each iteration, and an iteratively adjusted trust region that forces the algorithm to take steepest descent steps when serious difficulties are encountered. Steepest descent guarantees reduction in the norm of F for small trust region sizes and thus global convergence. However, it is important to emphasize that global convergence does not mean guaranteed convergence to a solution! It means convergence to either a solution or a singular point. This is a subtle and often misunderstood point. The way the dogleg strategy works is by using Newton steps when they are inside the trust region and simultaneously reduce the value of the least-squares objective function. Otherwise, either dogleg or steepest descent steps are used. In regions where numerical difficulties are not too severe (i.e., where the trust region size is small or where the Newton step is large in magnitude), dogleg steps are usually used. Dogleg steps essentially bend Newton steps in the direction of steepest descent steps in an effort to reduce the least-squares objective function. In regions of severe nonlinearity (e.g., near singular points), the trust region is deliberately made very small and steepest descent steps are used. This is because Newton steps tend to be “wild” in these regions, align with the null space of the Jacobian matrix, and generally take iterates well outside the trust region regardless of its size. Remember, it is these steepest descent steps that actually guarantee the global convergence of the dogleg strategy. We refer the reader to the original paper by Powell19 for a more complete treatment of the basic steps of the algorithm, the “rules” for both selecting Newton, dogleg, and steepest descent steps as well as adjusting the size of the trust region iteratively and the global convergence proofs. We use essentially the same rules in our complex domain trust region algorithms, which are really just complexified versions of the dogleg strategy with enhancements to avoid termination at singular points by identifying singular regions, quickly accelerating to the singular point, and then performing a full or partial eigenvalueeigenvector decomposition to find a point of lower norm. Once this point of lower norm is found, the algorithm returns to the complex domain dogleg strategy, terminating at either a solution or singular point, and this alternation between acceleration and eigenvalue-eigenvector decomposition and the complex domain dogleg strategy repeats itself until a solution is found. 2.2. Singularity. There are two key issues surrounding the presence of singular points: (1) calculating singular points quickly and (2) guaranteeing a path from a singular point to another stationary point of lower norm (either another singular point or a solution). 2.2.1. Accelerating to Singular Points. Because convergence to singular points can be slow, Lucia and Liu12,13 have developed simple tools for identifying singular regions and accelerating to singular points.
Ind. Eng. Chem. Res., Vol. 39, No. 6, 2000 1715
Singular regions are identified by the ratio test of Lucia and Liu12 which is given by the pair of inequalities
||∆ZkN||/||∆ZkC|| > M
(6)
||F(Zk,p)|| > ||[J(Zk,p)]T F(Zk,p)||
(7)
and
where M is some suitably chosen large positive number. This ratio test correctly characterizes singularity and has been observed to provide a good practical measure for identifying singular regions. Also, the parameter M is easy to choose. Once a singular region has been identified, the Newton step in the trust region algorithm is replaced by a second-order Newton step defined by
∆ZkN2 ) -(JTJ + A)-1 [J(Zk,p)]T F(Zk,p)
(8)
where J ) J(Zk,p) and A ) ΣFiAi and Ai is the secondderivative matrix (or its approximation) of the component function Fi. The matrix A in eq 8 can be approximated in several waysseither by finite difference second derivatives, a variety of quasi-Newton updates including the Powell-symmetric-Broyden (PSB), Broyden-Fletcher-Goldfard-Shanno (BFGS), or symmetric rank-one (SR1) updates or it can be split into analytically available and unavailable second derivatives and approximated by the class of thermodynamically consistent updates as in Venkataraman and Lucia.20 Note that while the second-order Newton-like step replaces the usual Newton step in the trust region strategy, the trust region strategy is still used to guarantee convergence to a singular point. However, second-order steps are usually taken near the singular point and thus the rate of convergence is usually at least Q-superlinear. 2.2.2. Guaranteeing a Path away from a Singular Point. Sridhar and Lucia9 proved the following important result regarding the existence of a path from a singular point to a stationary point of lower norm. Theorem 1. Nondegenerate singular points of any analytic function, F, correspond to saddle points of the real-valued function ||F||2 ) FTF. The details of the proof of this theorem can be found in Sridhar and Lucia9 (p 585-587) and rely on the use of Cauchy-Riemann conditions and Laplacian equations. This theorem rigorously establishes that singular points at which F * 0 are saddles and thus through them must pass directions of negative curvature. As stated earlier, this is probably the single most important result in complex domain process simulation for two reasons: (1) it rigorously guarantees that there is at least one path (i.e., along the eigenvector associated with the largest negative eigenvalue) that leads away from any singular point in a direction of lower norm. Thus, norm-reducing numerical methods guarantee that iterates cannot return to a previously determined singular point and can be forced to a solution through (a sequence of) decompositions, and (2) it is implementable through the use of eigenvalue-eigenvector calculations of the Hessian matrix of FTF. Initially, I thought that this theorem had serious drawbacks when applied to largescale problems because it would require the calculation of all eigenvalues and eigenvectors. However, recently I discovered other structural properties associated with the Hessian matrix of the two-norm of F that are related to analytic functions that require only the computation of the largest and/or smallest eigenvalues (and associ-
ated eigenvectors) of the Hessian matrix of FTF to determine the directions of negative curvature. I state these results here in the form of three theorems for the first time. Theorem 2. Let ∇2||F||2 be the Hessian matrix of the function ||F||2 ) FTF, where F(Z), for Z ∈ Cn, is analytic. Then,
(∂2||F||2/∂xi∂xj) + (∂2||F||2/∂yi∂yj) ) 0, i,j ) 1, 2, ..., n (9) This theorem has two important consequences: (1) each pair of diagonal elements of ∇2||F||2 sum to zero, and (2) each pair of elements (∂2||F||2/∂xi∂xj) and (∂2||F||2/ ∂yi∂yj) also sum to zero. By far the most common singular point computed in practice are non-zero-valued, real local minima of FTF. Unlike saddle points, real-valued local minima are located in convex regions and function as attractors. Here, I give a simple theorem for singular points restricted to the real subspace of Cn. Theorem 3. Let Z′ ∈ Cn be any singular point restricted to Rn. Then,
(∂2||F||2/∂xi∂yj) + (∂2||F||2/∂yi∂xj) ) 0, i,j ) 1, 2, ..., n (10) Theorems 2 and 3 show that real singular points result in the following interesting structure of ∇2||F||2
∇2||F||2 )
[
H 0 0 -H
]
(11)
where H is the n × n symmetric matrix whose elements are hij ) (∂2||F||2/∂xi∂xj). Now, I can state the main result. Theorem 4. Let Z′ ∈ Cn be a real singular point of T F F. If λ g 0 is an eigenvalue of ∇2||F||2 with the associated eigenvector p ) [(p1,0), (p2,0), ...]T, then -λ is also an eigenvalue of ∇2||F||2 with eigenvector q ) [(0,p1), (0,p2), ...]T. This theorem has two important practical consequences at real singular points that are local minima or saddles of FTF. First, there is no need to “extend” ∇2||F||2 to the complex domain when a real singular point is located. The eigenstructure of ∇2||F||2 can be obtained from the real, symmetric n × n matrix H. This saves both storage and arithmetic operations. Second, if all that is needed for computation is a direction of negative curvature, then only a few negative eigenvalues and associated eigenvectors need to be calculated. In particular, it is convenient to use techniques such as Rayleigh quotients, the power method, Lanzcos method, etc. to find λmax and λmin of H and their corresponding eigenvector rmax and rmin. The largest and smallest negative eigenvalues are then -λmax and -λmin and the associated directions of negative curvature are -rmax and -rmin. Thus, partial eigenvalue-eigenvector decomposition for large-scale problems is computationally tractable and requires only o(n) operations. Perturbations in any negative normalized eigenvector direction must be properly scaled to result in a decrease in the two-norm from its value at a singular point. Stepsizes or scaling factors can be determined from the following extrapolated Taylor series expansion of FTF:
FTF ) (FTF)′ + R(FTJ)v + (1/2)vT∇2||F||2v ) (FTF)′ + (1/2)R2vT∇2||F||2v (12)
1716
Ind. Eng. Chem. Res., Vol. 39, No. 6, 2000
since the gradient term FTJ ) 0 at any singular point and where v is any normalized eigenvector. Now, if v is associated with any negative eigenvalue, then
FTF ) (FTF)′ + (1/2)R2vT∇2||F||2v ) (FTF)′ - (1/2)R2λ(vTv) (13) because ∇2||F||2v ) -λv. Extrapolation to FTF ) 0, since we are looking for a zero-valued minimum in FTF, gives T
R ) sqrt[2(F F)′/λ]
(14)
since (vTv) ) 1 if v is a normalized eigenvector. Note however that eq 14 points to potential numerical difficulties due to poor scaling. For example, when the value of (FTF)′ is nonzero but small and the largest negative eigenvalue is large in magnitude, the resulting stepsize predicted by eq 14 can be very small or even below machine precision. Such small steps will not usually take the next iterate far enough away from the singular point to give rapid convergence to the next stationary point. 2.3. Process Feasibility. Process feasibility is closely related to singularity in some ways and the resolution of feasibility can be either a global or local issue. When product compositions or other specifications for a distillation are truly infeasible, no real solution (i.e., no zerovalued stationary point of FTF) exists and there will only be real-valued, singular non-zero minima of FTF. Eigendecomposition at the singular point will usually lead to a complex-valued solution, but in cases where the feasible region is a nonconvex set, it can be computationally demanding to resolve the issue of feasibility in a global sense. In addition, while it is possible to argue that no real- or complex-valued solutions exist for some problems, I have not observed this in the variety of process engineering applications my students and I have studied. On the other hand, when the real volume or compressibility roots to a cubic EOS in vapor-liquid equilibrium (VLE) calculations is being found, feasibility takes on a slightly different meaning. If the coefficients of the EOS are real, then a fundamental theorem of algebra guarantees that there are three roots and that at least one of them is real because the EOS is a polynomial. If specifications are such that only one real root exists, that root will correspond to either vapor or liquid and feasibility in this case is defined in terms of the existence of a particular type of root or phase (i.e., whether the specifications admit a single- or two-phase solution). The same is true in many other process engineering calculations (e.g., more complicated EOS, flash and distillation calculations, reaction engineering problems, etc.). A real solution always exists; it just may not be the anticipated or desired one. Complex domain trust region methods have finite step termination due to the combined global convergence properties of trust region methods and eigenvalueeigenvector decomposition. This finite step termination provides one way of partially resolving process feasibility. Calculation of a complex-valued solution to any set of process model equations clearly indicates the infeasibility of that particular solution for the given problem specifications. For example, the calculation of a complexvalued solution to any retrograde VLE flash problem clearly indicates that the associated flash specifications
are infeasible with respect to a two-phase solution and are in the single-phase region. See, for example, Gow et al.11 2.4. Process Dynamics. It is a straightforward matter to use difference equations to convert the differential-algebraic equations (DAEs) that typically model dynamical chemical processes into the nonlinear iterative map given by eq 2 at each “time” step. That is, a DAE system of the form
dZ/dt ) F(Z,p)
(15)
C(Z,p) ) 0
(16)
can be easily rewritten as eq 2 using a numerical integration method or difference equations for the derivative term dZ/dt, where the independent variable, t, represents time. Lucia and Wang7 have in fact done this and presented a theoretical framework for dynamical process simulation in the complex domain. However, it is important to understand that by dynamical process simulation in the complex domain I mean that given our present understanding all variables except the independent (or time) variable are permitted to take on complex values. Lucia and Wang7 applied this theoretical framework to a fixed vapor flow, isothermal (VT) flash problem in the retrograde region and showed that (1) the bifurcation of a pair of steady-state solutions into the complex domain lead to unstable dynamics in the real domain, as one might expect, (2) complex steady states can be dynamically stable, and (3) in situations where more than one complex steady-state exists, the fractal nature of the basin boundaries can give rise to unstable dynamics. 3. Advantages and Disadvantages of Complex Domain Simulation Like many things, complex domain numerical methods have both advantages and disadvantages in practice. 3.1. Advantages. The advantages of complex domain process simulation include (1) continuity and smoothness, (2) alternate pathways to solutions, (3) finite step termination of trust region methods to solutions, (4) relevance to issues surrounding feasibility/infeasibility, (5) annihilation of trivial solutions in VLE phase equilibria modeled by EOS, (6) the ability to allow negative values of flows, pressures, compositions, and other variables, and (7) the ease with which real domain codes can be converted into complex domain codes. Continuity and Smoothness. Figures 1 and 2, which are taken from Gow et al.,11 clearly show what I mean by continuity and smoothness (or differentiable). Note in Figure 1 that the real-valued solutions to a VT flash problem for any mixture of methane (R50)-ethene (R1150) is continuously and smoothly connected to solutions in the complex domain through one or more bifurcation points. In addition, Figure 2 shows that the two-norm and/or absolute value of F(Z) for the SoaveRedlich-Kwong (SRK) EOS is also smooth. Note the regularity of the level curves shown in Figure 2. Continuity and smoothness imply that Newton-like methods, which are of course derivative-based, can be used with confidence because numerical difficulties that would normally occur because of derivative discontinuities do not occur.
Ind. Eng. Chem. Res., Vol. 39, No. 6, 2000 1717
Figure 1. Isothermal VLE for R50(1)-R1150(2) at 223.15 K using the SRK equation.
Figure 3. Iterative compressibilities for isothermal VLE for R23(1)-R22(2) at 343.15 K using the SRK equation.
Figure 2. Vapor compressibility on second iteration for isothermal VLE for R23(1)-R(22) at 343.15 K using the SRK equation.
Alternate Pathways to Solutions. By their very nature, complex domain numerical methods permit both real- and complex-valued iterates. This suggests, among other things, that it should be possible to find examples in which calculations restricted to the real domain fail while those performed or allowed to enter the complex domain converge to a solution. Figure 3, also taken from Gow et al.,11 illustrates how an alternate pathway through the complex domain provides convergence to a real solution for a VT flash calculation for the binary refrigerant mixture trifluoromethane (R23)-chlorodifluoromethane (R22) using the SRK EOS. Note that the estimate of the vapor compressibility root starts out as real-valued, then becomes complex-valued, but eventually converges to a real-valued vapor root. In contrast, the liquid root begins as complex-valued but also converges to a real root. In a similar manner the associated iterative flash calculation variables (compositions, phase flows, and pressure) take on complex values, but they, too, converge to a real-valued solution. If, on the other hand, this same flash calculation is restricted to the real domain, many starting values converge to a trivial solution (i.e., equal phase compositions for vapor and liquid). Taylor et al.10 also give an example of a reboiled absorber in which real domain Newton’s method fails for many bottoms purity specifications but for which complex domain Newton’s method easily finds solutions.
Figure 4. Basins of attraction for Newton’s method for the SRK equation.
Finite Step Termination to Solutions. Unlike complex domain Newton’s method which can wander periodically or chaotically near singular regions, complex domain trust region methods share the finite step termination property of traditional trust region methods solely because of the added advantages provided by eigenvalue-eigenvector decomposition. This is most easily and accurately demonstrated using basins of attraction because they represent numerical behavior for a wide variety of starting points. Figures 4-6, which is an exercise in finding the roots of the SRK EOS for a 60-40 mol % mixture of N2-C14H30 and taken from Guo,5 illustrate the point quite well. Note that there is periodic behavior, in particular, period 4 behavior, for Newton’s method contained throughout the basin boundaries shown as the black region in Figure 4. Any starting point chosen from this black region, or Julia set, will eventually settle into a 4-cycle and remain there forever (barring rounding and truncation error). Figure 5 is the corresponding basins of attraction for a traditional trust region method without eigendecomposition. Note that many of the points in the basin boundary now converge to a singular point (or nonzero minimum in FTF) instead of behaving periodically because trust region methods
1718
Ind. Eng. Chem. Res., Vol. 39, No. 6, 2000
Figure 5. Basins of attraction for a traditional trust region method for the SRK equation.
Figure 6. Basins of attraction for a complex domain trust region method for the SRK equation.
force norm reduction. However, one difficulty has simply been traded for another. Figure 6, on the other hand, shows that all starting points converge to either a realor a complex-valued solution in a finite number of iterations when a complex domain trust region method with eigendecomposition is used. There is no periodic behavior because of norm reduction and no termination at a singular point due to the eigenvalue-eigenvector decomposition strategy that is part of our complex domain trust region method. Also, see Lucia et al.4 for an example of the numerical details associated with finite step termination of a complex domain trust region method for dew point temperature calculations. Feasibility/Infeasibility. Computation of complexvalued solutions often indicate infeasibility of certain types of solutions. Clearly, Figure 1 shows that convergence to a complex-valued solution on the right-hand side of the phase diagram indicates that the specifications (in this case overall feed composition) do not admit a real-valued two-phase VLE solution or that a realvalued two-phase solution is infeasible for the given specifications. My students, colleagues, and I have observed similar behavior in roots of EOS, in which either a real vapor or real liquid root is infeasible (or
Figure 7. Level curves for ||F||2 for a TP flash using a single realvalued EOS.
does not exist) for some specifications of temperature, pressure, and composition, in reaction engineering problems in which either a real-valued high-conversion or low-conversion solution does not exist (or is infeasible) for certain reaction parameters and in distillations in which real-valued high- or low-purity solutions are infeasible for certain column specifications. Annihilation of Trivial Solutions. In real domain calculations, it is common for very similar values of compressibility (or volume) roots to EOSs to be selected for a vapor and liquid, especially at high pressure where only one real physically meaningful root usually exists. This often causes equilibrium ratios (or K-values) to be very close to one and subsequent convergence of the phase equilibrium calculations to a trivial solution. The computation of all real and complex solutions to EOS always allows for distinctly different compressibility roots to be selected for a liquid and vapor. This in turn gives unequal fugacity coefficients and equilibrium ratios and results in the annihilation of trivial solutions. Figures 7 and 8 taken from Sridhar and Lucia9 give a clear illustration of the annihilation of trivial solutions for the case of an isothermal, isobaric (TP) flash in the critical region for the mixture N2-C14H30 using the Soave-Redlich-Kwong EOS. Note the family (or valley) of trivial solutions in Figure 7 which runs essentially perpendicular to one side of the triangle that represents a projection of the level curves of ||F(Z)|| onto the component mass balances when only real-valued roots are permitted at the EOS level of the calculations. In contrast, when complex-valued roots to the EOS are allowed, distinctly different liquid and vapor roots can be selected and the family of trivial solutions no longer exists. Moreover, the level curves in Figure 8 are monotonically decreasing in every direction toward the nontrivial solution. In my opinion, this is a very important result because it removes all concerns associated with trivial solutions. Lucia and Luo21 have recently shown that this annihilation is also possible for more complicated EOS such as the SAFT (statistical associating fluid theory) equation. Negative Flows, Compositions, and Other Variables. In many chemical process simulation environments safeguards must usually be used to avoid com-
Ind. Eng. Chem. Res., Vol. 39, No. 6, 2000 1719
Figure 8. Level curves for ||F||2 for a TP flash using a single complex-valued EOS.
Figure 9. F(Z) vs Z for the function F(Z) ) 1 - Z ln(3Z).
putational difficulties caused by negative values of flows, pressures, temperatures, and compositions. Many models involve functions of the form ln G(Z)sfunctions such as ln γi(x,T), ln φi(y,T,P), K-values, and other thermophysical propertiesswhich may not make sense or contain undefined terms for negative variable values. To illustrate the point, consider the task of finding the roots to the simple single-variable function
F(Z) ) 1 - Z ln(3Z) ) 0
(17)
on the interval (0, 1] so the variable Z can be thought of as a composition. This function has a single realvalued solution at Z* ) 0.95246 and a singular point at Z′ ) 1/(3e) ) 0.122 63, which is a maximum since F′′(Z′) ) -1/Z′ ) -8.155. Figure 9 shows a plot of this function on the interval (0,1]. It is also easy to see that real domain Newton’s method initialized at any point on the interval (0,Z′) ) (0, 0.12263) will produce a second iterate that is negative, at which the real-valued functions ln(3Z) and F(Z) are undefined. As a result, real Newton’s method terminates because of a Fortran subroutine library system error. If, on the other hand, a complex domain version of Newton’s method is used to solve the same problem, the complex-valued functions ln(3Z) and F(Z) are defined for all values of Z, even negative values, and complex domain Newton’s method finds the real solution Z* in under 10 iterations for
almost all initial values on the interval (0,Z′), even though some of the iterates are negative. Ease of Conversion of Real Domain Codes. It is a relatively simple matter to convert a real domain computer implementation of any algorithm to a complex domain version. My students and I have done this for matrix-vector utility programs, nonsymmetric and symmetric linear equation solvers using LDU and LDLT factorizations, and obviously nonlinear equation-solving methods such as direct and accelerated direct substitution, Newton’s method, and trust region methods. To do this in Fortran, one need only change the type of declaration of many of the real variables from say REAL*8 to COMPLEX*16 if double-precision arithmetic is desired and change any relevant Fortran library function calls (i.e., DLOG, DSQRT, etc.) to their complex counterparts (i.e., CDLOG, CDSQRT, etc.). Arithmetic assignment and data statements are easily converted by setting the appropriate variables to (X,Y) which is interpreted as X + Yi where X and Y are of desired precision. 3.2. Disadvantages. As with anything else, there are also disadvantages to complex domain numerical methods. The disadvantages include (1) increased storage demands, (2) an increase in the number of arithmetic operations, and (3) increased cost of simulation. Storage. Complex codes usually require roughly twice the real variable storage as real codes simply because both the real and imaginary parts of any complex variable Z ) X + Yi need to be stored. Integer storage for things such as index topologies for sparse linear equation solvers and related tasks may also increase. Numerical Operations. Arithmetic operations in the complex domain increase because both real and imaginary parts of any complex number must be considered in additions and multiplications. Extensions of sparse linear algebra to the complex domain are straightforward and help reduce unnecessary arithmetic operations. Simulation Cost. Taylor et al.10 report relative computer times for real and complex domain versions of Newton’s method for a set of distillation examples that show that complex versions of the same algorithm can take between 1.5 and 2.8 times longer for the same problem from the same starting point. However, this should be interpreted carefully because successful convergence in both domains is necessary to make a time comparison. That is, there are many cases in which complex versions of a particular algorithm solve a given problem when real versions do not. See the reboiled absorber example (pp 95-96) in Taylor et al.10 for an illustration of this. 4. Future Directions Complex domain process simulation is a new subject area. As I see it, theoretical and algorithmic research is needed along with developments in industrial practice and extensions to other related process engineering areas (i.e., dynamics, control, and optimization) for complex domain process simulation to gain widespread acceptance. 4.1. Extensions to Other Process Engineering Areas. It is quite natural to consider extensions of complex domain methods to unsteady-state process
1720
Ind. Eng. Chem. Res., Vol. 39, No. 6, 2000
simulation, steady-state and dynamical process optimization, process control, and other areas. Much of the mathematical theory and algorithmic framework we have developed for nonlinear algebraic equations carries over to all of these areas, largely because of the natural isomorphism between Cn and R2n. However, care must be exercised in developing algorithms where signs of variables or other quantities are used to make decisions (e.g., in exchanging basic and nonbasic variables in simplex tableaus, adding and deleting inequalities in active set strategies, etc.). Also, the underlying physical meaning of complex-valued trajectories, singular points, and/or solutions should always be sought. Complex Domain DAE Systems and Dynamical Process Simulation. Differential/algebraic equation systems arise from relaxation methods for solving steady-state equations and in dynamical process simulation. Lucia and Wang7 have taken a first step in extending steady-state process simulation to dynamical systems by successfully developing a complex domain integration algorithm for solving DAE systems based on backward (or implicit) Euler integration. However, many more explicit and implicit integration algorithms need to be extended to the complex domain to determine if complex domain dynamical process simulation holds any real promise as a tool for the future. From the standpoint of a strict mathematical tool for reaching steady states by integrating differential equations, complex domain integration methods obviously provide greater flexibility (i.e., trajectories) than real domain integration methods. For example, one of my Ph.D. students has recently shown that it is possible to start with real initial conditions and to have trajectories (or iterates) enter and reside in the complex domain and eventually converge to a real steady state. On the other hand, the physical meaning, if any, of trajectories that (partially) reside in the complex domain is an open issue. Other open issues include complex domain process dynamics near, through, and/or to singularities, which have applications to such things as start-up, shutdown, failure analysis, and other areas, and integration of nonsmooth DAE models. Complex Domain Process Optimization. Obvious complex domain extensions of unconstrained optimization, linear programming (LP), nonlinear programming (NLP), and mixed-integer nonlinear programming (MINLP) methods are easy to envision. Complex versions of the revised simplex method in linear programming, unconstrained quasi-Newton methods such as the Broyden-Fletcher-Goldfard-Shanno (BFGS) update, NLP methods such as successive quadratic programming (SQP), generalized reduced gradient (GRG), and interior point (IP) methods for constrained optimization, branch and bound, and generalized Benders decomposition for global optimization and others are easy to imagine. However, I cannot decide if I am enthusiastic about the usefulness of such an approach for optimization or not. From a strictly mathematical perspective extension of optimization methods to the complex domain poses no obvious difficulties, again largely because of the isomorphism between Cn and R2n. Moreover, constraint infeasibilities in the real domain may be easy to resolve in the complex domain because of the presence of complexvalued solutions. On the other hand, my reservations stem from two basic concerns: (1) the use of complexvalued objective functions and (2) the use of sign- and magnitude-related decisions when inequality constraints
and multipliers are complex-valued (e.g., in exchanging basic and nonbasic variables in simplex methods, in adding and deleting inequalities in active set methods, etc.). Relative objective function values in the complex domain can be “compared” through the use of the complex absolute value function or an equivalent norm that maps C into R. That is, any measure of “better” or “best” must be based on the composite map ||F(Z)||: Cn f C f R. Typically, “distances” or metrics are measured using a suitable norm so constraint violations can also be handled by various norms on Cn. My biggest concern regarding complex domain optimization comes from the fact that the concept of sign in the complex domain is unclear to me. Many decisions in many optimization techniques are based on signs of coefficients, multipliers, and/or other quantities. In these cases, it may be prudent to restrict the certain variables to be realvalued, when possible, and use some sort of hybrid approach where some of the variables are complexvalued while others are real. In fact, this is the approach we have used in extending trust region methods to the complex domain in which the parameter for computing the dogleg step is restricted to being real-valued. Complex-valued objective functions, constraints, and Lagrange and Kuhn-Tucker multipliers in constrained optimization also complicate the algebraic/geometric properties of the Lagrangian function. Process Control in the Complex Domain. Process control is a natural extension of process dynamics. Because many traditional topics in control such as root locus diagrams and frequency domain techniques such as Bode plots and Ziegler-Nichols diagrams are complex domain techniques, extensions of process control theory and practice to the complex domain seem all the more natural. However, because the value of complex domain process dynamics is unproven, I suggest that complex domain process dynamics and control be studied togethersin much the same way that they have become part of undergraduate and graduate education and research in chemical engineering. Here again, I have mixed feelings about extensions to the complex domain. While the mathematical tools are readily available and all of the advantages and disadvantages discussed earlier carry over, I have concerns related to the physical meaning and realizability of complex modes of traditional and modern control (i.e., P-only, PI, PID, model predictive and internal model control). 4.2. Theoretical/Algorithmic Research. There are also many theoretical research needs. However, I have always believed that the only theoretical research worth doing is that which provides implementable algorithmic development. Here, I outline a set of core theoretical/ algorithmic issues common to many extensions of process engineering to the complex domain: process simulation, process dynamics and control, and process optimization. Stabilization Techniques. My students and I have successfully extended trust region methods for solving nonlinear algebraic equations to the complex domain. However, there are many other stabilization techniques and a variety of different contexts in which they are used. Line-searching methods are commonly used in unconstrained and constrained optimization to monitor progress toward an optimum. Several line search (or merit) functions exist for this purpose (e.g., the objective function, penalty and barrier functions, augmented Lagrangian functions, etc.) with varying mathematical
Ind. Eng. Chem. Res., Vol. 39, No. 6, 2000 1721
properties (i.e., exact or inexact and either differentiable or nondifferentiable). The choice of merit functions in extensions to the complex domain is an open issue, although one obvious choice is to use the complex absolute value function to map the current set of merit function from C to R. However, other choices may exist and should be explored. In addition, merit functions used in constrained optimization which often use multipliers to weigh constraint penalties present additional difficulties because the concept of sign in the complex domain is unclear. Similar statements can be made about line searching in nonlinear equation solving where the “objective function” is typically the leastsquares function. Trust region methods for nonlinearly constrained optimization often cause difficulties that manifest themselves as linearized constraint infeasibilities (i.e., the linearized constraints including the trust region constraint define an empty feasible region in the real domain). I believe extension of this class of stabilization techniques to the complex domain may have the potential to remove difficulties associated with constraint infeasibilities because of the presence of complex-valued solutions. That is, it may be possible to move into the complex domain when constraint infeasibilities occur on a given iteration in the real domain and subsequently re-enter the real domain on some later iteration when infeasibility is no longer a concern. Quasi-Newton Matrix Updating. Quasi-Newton methods such as the Broyden, Powell-symmetricBroyden (PSB), Davidon-Fletcher-Powell (DFP), and BFGS updates are still widely used in process simulation and optimization, even with the recent shift toward equation-oriented architectures and the use of analytical partial derivative information. They will continue to be used in situations where proprietary “procedures” for unit operation models and/or thermophysical properties are part of the flowsheet. The performance and usefulness of nonsymmetric and symmetric complex domain quasi-Newton updates is largely untested and thus a study of the numerical behavior of complex domain quasi-Newton updates would be useful. The single biggest open issue here revolves around the use of standard secant and other constraints and the interpretation of quasi-Newton updates as least change updates. For an n × n nonsymmetric, complex-valued matrix subject to only secant conditions, there are 2n2 unknowns (i.e., the real and imaginary parts of the elements of this complex-valued matrix) and n secant conditions. Thus, there is more freedom in a complex domain quasi-Newton matrix than there is in its real counterpart (i.e., 2n2 - n compared to n2 - n). There is also added freedom when there are thermodynamic constraints such as the Gibbs-Duhem and GibbsHelmholtz equations (see Venkataraman and Lucia22,23) and when the matrix is symmetric. It would be useful to study how this additional freedom affects the quality and numerical performance of quasi-Newton updates in both nonsymmetric and symmetric equation solving. Pivoting. The algebra of matrix factorization methods for solving nonsymmetric and symmetric systems of linear equations easily carries over to the complex domain. We have certainly experienced no computational difficulties in this regard for n × n linear systems and thus it appears that analogues for selecting pivots using the magnitude of complex numbers also work well. On the other hand, the concept of pivoting is somewhat different in simplex and revised simplex methods for
linear and successive linear programming. Here, both signs and magnitudes of “pivots” are used to exchange basic and nonbasic variables to move from one vertex (or feasible solution) to another and reduce the value of the linear objective function. Because the coefficients for both the constraints and objective function play roles in selecting the variables for this exchange, it is not at all clear to me that standard rules governing pivot selection would remain unchanged when the coefficients were complex-valued. Active Set Methods. Extensions of active set methods to the complex domain also raise many interesting theoretical and algorithmic questions. The heart of any active set method is the strategy used to add and delete inequalities from the active set and, in most methods, inequality constraints are added to and deleted from the active set, usually one at a time, on the basis of relative constraint violations and signs of Kuhn-Tucker multipliers, respectively. For example, in convex quadratic programming, the inequality with the smallest violation is added to the active set while the inequality with the largest negative Kuhn-Tucker multiplier is deleted from the active set. Here again, it is entirely unclear to me if using the relative magnitude of complex numbers in the current rules for constraint addition and deletion would be appropriate in cases where complex-valued unknown variables and/or multipliers are allowed. While using the relative magnitude (or norm) of complexvalued constraint violations as a means of adding an inequality to the active set seems plausible, it is unclear and worthy of investigation. Again, the concept of sign in the complex domain is clouded for complex-valued Kuhn-Tucker multipliers. How this would impact constraint deletion is also unclear. Similar concerns exist for all active set methods in mathematical programming, whether they are part of feasible or infeasible path algorithms. Constraint Feasibility/Infeasibility. Constraint feasibility/infeasibility is an important concern in both nonlinear equation solving and optimization. Some results with regard to feasibility/infeasibility in nonlinear equation solving have been discussed in section 3.1. Complex domain methods may have great potential for removing theoretical and computational difficulties associated with infeasibilities in (nonlinearly) constrained optimization as well because of the existence of complexvalued solutions. In particular, when no real-valued feasible solution to a system of constraint equations and/ or inequalities exists, it may be possible to find complexvalued solutions and continue the computations. On the other hand, it is unclear if subsequent calculations will always re-enter the real domain or preclude convergence to a desired solution. This, of course, depends on the algebraic structure of the solutions to the particular system of equations and/or inequalities under consideration. For example, for some systems of nonlinear algebraic equations, the calculations often terminate at a complex-valued solution, even when other real solutions exist. Moreover, for nonconvex feasible regions only local resolution can be expected. However, I believe a complex domain approach to constraint feasibility/ infeasibility is an intriguing idea worth pursuing. Acceleration Methods and Saddle Point Theory. Quadratic acceleration and the current theory surrounding saddle point decomposition at singular points (i.e., Sridhar and Lucia9 and this paper) are valuable because they provide both theoretical rigor and imple-
1722
Ind. Eng. Chem. Res., Vol. 39, No. 6, 2000
mentable strategies for moving quickly to and then from singular points to solutions. While these tools are useful, more efficient eigendecomposition and/or ways of approximating curvature are needed, particularly for large problems. Methods that do not require computing and/ or storing the Hessian matrix of ||F||2 during acceleration and saddle point decomposition would be invaluable. For example, conjugate gradient or other Krylov subspace techniques could be used to accelerate to a singular point and perhaps subspace information generated by these methods during the acceleration phase could be used to move away from a singular point as well without the need for eigendecomposition. Analogues of the current saddle point theory, if they exist, could provide new methods for global optimization. In particular, if local optima in the real domain are saddle points in the complex domain, then perhaps paths to global optima through the complex domain might exist. This is certainly the case for the leastsquares function (or any other norm on Cn) when local minima correspond to singular points of JTJ (or part of the least-squares Hessian matrix), even though the Hessian matrix is nonsingular. On the other hand, there is again the concern for some sort of guarantee that iterates will re-enter the real domain and converge to a real-valued global minimum when one or more exists. I believe it would be interesting to explore these ideas to see what theoretical and algorithmic results could be found. 4.3. Industrial Applications. I firmly believe that there is a strong need for researchers other than me, my students, and co-workers to evaluate the usefulness of complex domain process simulation, process dynamics, and process optimization and to compare complex domain numerical methods and complex domain process engineering tools to current real domain practices. Perhaps what is most needed to make these comparisons meaningful are industrial applications or examples. The work of Taylor et al.10 does contain one industrial application (a reboiled absorber) that was considered very difficult to solve because of purity specifications in the bottoms product and those results clearly show that complex domain process simulation is more reliable, more robust, and more efficient than real domain process simulation. However, many more and varied industrial applications are needed before any hard and fast statements can be made about which approach is better. Clearly, for complex domain process engineering to gain acceptance, it must be more reliable and/or more robust than its real domain counterpart on a regular basis. Complex domain methods must either solve more problems than real domain tools or, in cases where both types of algorithms work, they must be more efficient. Otherwise, complex domain process engineering will never be widely used. Cadenza I would like to leave you with one thought. I am aware of the fact that many people intuitively feel that performing calculations in the complex domain is akin to opening a can of worms. The thought certainly crossed my mind in the late 1980s when my students and I first started looking at process simulation in the complex domain. However, with persistence and rigorous mathematics, I believe solid advances in theory and practice in many areas of process engineering can be made.
Acknowledgment I would like to thank many of my former graduate students (J. Xu, X. Wang, X. Guo, and D. Liu), my former Ph.D. student and postdoctoral fellow, L.N. Sridhar, and my colleagues R. Taylor and A. S. Gow for their contributions to this work. I would also like to thank the National Science Foundation for their continued support of this work through Grants CBT9007854, CTS-9409158, CTS-9696121, and CTS-9818130 and the U.S. Department of Energy, Office of Basic Energy Sciences for their support under Grant DEFG02-86ER13552. Literature Cited (1) Lucia, A.; Guo, X.; Richey, P. J.; Derebail, R. Simple Process Equations, Fixed-Point Methods, and Chaos. AIChE J. 1990, 36, 641. (2) Cayley, A. The Newton-Fourier Imaginary Problem. Am. J. Math. 1879, 2, 97. (3) Lucia, A.; Xu, J. Global Minima in Root Finding. In Recent Advances in Global Optimization; Floudas, C. A., Pardalos, P. M., Eds.; Princeton University Press: Princeton, NJ, 1992. (4) Lucia, A.; Guo, X.; Wang, X. Process Simulation in the Complex Domain. AIChE J. 1993, 39, 461. (5) Guo, X. Issues in Complex Domain Process Simulation. Ph.D. Thesis, Clarkson University, Potsdam, NY, 1995. (6) Lucia, A.; Taylor, R. Complex Iterative Solutions to Process Model Equations? Comput. Chem. Eng. 1992, 16S, 387. (7) Lucia, A.; Wang, X. Complex Domain Process Dynamics. Ind. Eng. Chem. Res. 1995, 34, 202. (8) Wang, X. Chaos in Several Dimensions, M.S. Thesis, Clarkson University, Potsdam, NY, 1995. (9) Sridhar, L. N.; Lucia, A. Process Analysis in the Complex Domain. AIChE J. 1995, 41, 585. (10) Taylor, R.; Achuthan, K.; Lucia, A. Complex Domain Distillation Calculations. Comput. Chem. Eng. 1996, 20, 93. (11) Gow, A. S.; Guo, X.; Liu, D.; Lucia, A. Simulation of Refrigerant Phase Equilibria. Ind. Eng. Chem. Res. 1997, 36, 2841. (12) Lucia, A.; Liu, D. An Acceleration Method for Dogleg Methods in Simple Singular Regions. Ind. Eng. Chem. Res. 1998, 37, 1358. (13) Lucia, A.; Liu, D. More Process Simulation in Singular Regions. Comput. Chem Eng. 1999, 235, 5367. (14) Liu, D. Improving the Numerical Performance of the Complex Dogleg Method in Singular Regions. M.S. Thesis, University of Rhode Island, Kingston, RI, 1997. (15) Liu, D. Chemical Process Simulation in Singular Regions. Ph.D. Thesis, University of Rhode Island, Kingston, RI, 2000. (16) Fatou, P. Sur les Equations Fonctionelle. Bull. Soc. Math France 1919, 47, 161. (17) Julia, G. Memoir sur l’Iteration des Function Rationelles. J. Math Pures Appl. 1918, 4, 47. (18) Curry, J. H.; Garnett, L.; Sullivan, D. On the Iteration of a Rational Function: Computer Experiments with Newton’s Method. Commun. Math. Phys. 1983, 91, 267. (19) Powell, M. J. D. A Hybrid Method for Nonlinear Equations. In Numerical Methods for Nonlinear Algebraic Equations; Rabinowitz, P., Ed.; Gordon and Breach, Amsterdam, 1970. (20) Venkataraman, S.; Lucia, A. Solving Distillation Problems by Newton-like Methods. Comput. Chem. Eng. 1988, 12, 55. (21) Lucia, A.; Luo, Q. Binary Refrigerant-Oil Phase Equilibrium Using the SAFT Equation. Adv. Environ. Res. 1999, submitted for publication. (22) Venkataraman, S.; Lucia, A. Exploiting the Gibbs-Duhem Equation in Separation Calculations. AIChE J. 1986, 32, 1057. (23) Venkataraman, S.; Lucia, A. Avoiding Variable Scaling Problems Using the Gibbs-Helmholtz Equation. Comput. Chem. Eng. 1987, 11, 73.
Received for review September 1, 1999 Accepted March 9, 2000 IE990655C