Reduced Order Model Based on Principal Component Analysis for

Nov 11, 2008 - Morgantown, West Virginia 26507, Ansys, Inc., Lebanon, New Hampshire 03766, and Ansys, Inc.,. Morgantown, West Virginia 26505...
1 downloads 0 Views 2MB Size
Energy & Fuels 2009, 23, 1695–1706

1695

Reduced Order Model Based on Principal Component Analysis for Process Simulation and Optimization† Yi-dong Lang,‡,§ Adam Malacina,‡,§ Lorenz T. Biegler,*,‡,§ Sorin Munteanu,| Jens I. Madsen,⊥ and Stephen E. Zitney§ Department of Chemical Engineering, Carnegie Mellon UniVersity, Pittsburgh, PennsylVania 15213, Collaboratory for Process and Dynamic Systems Research, National Energy Technology Laboratory, Morgantown, West Virginia 26507, Ansys, Inc., Lebanon, New Hampshire 03766, and Ansys, Inc., Morgantown, West Virginia 26505 ReceiVed NoVember 11, 2008. ReVised Manuscript ReceiVed January 12, 2009

It is well-known that distributed parameter computational fluid dynamics (CFD) models provide more accurate results than conventional, lumped-parameter unit operation models used in process simulation. Consequently, the use of CFD models in process/equipment co-simulation offers the potential to optimize overall plant performance with respect to complex thermal and fluid flow phenomena. Because solving CFD models is time-consuming compared to the overall process simulation, we consider the development of fast reduced order models (ROMs) based on CFD results to closely approximate the high-fidelity equipment models in the co-simulation. By considering process equipment items with complicated geometries and detailed thermodynamic property models, this study proposes a strategy to develop ROMs based on principal component analysis (PCA). Taking advantage of commercial process simulation and CFD software (for example, Aspen Plus and FLUENT), we are able to develop systematic CFD-based ROMs for equipment models in an efficient manner. In particular, we show that the validity of the ROM is more robust within well-sampled input domain and the CPU time is significantly reduced. Typically, it takes at most several CPU seconds to evaluate the ROM compared to several CPU hours or more to solve the CFD model. Two case studies, involving two power plant equipment examples, are described and demonstrate the benefits of using our proposed ROM methodology for process simulation and optimization.

1. Introduction Steady-state process simulators typically solve systems of nonlinear equations consisting of mass and energy balances, thermodynamic relationships, and phase and chemical equilibrium calculations. However, process simulations typically do not consider equipment geometry or fluid dynamics inside the equipment. For example, the unit operation model of a continuous stirred tank reactor (CSTR) ignores the influence of fluid dynamics and assumes that the contents of the reactor are perfectly mixed. Additionally, unit operation models in process simulators do not consider operating parameters related to mixing and fluid flow, such as the shaft speed in a CSTR. Instead, computational fluid dynamics (CFD) is required to focus on the microscale phenomena within process equipment items. In process design, analysis, and optimization, it is not uncommon that engineers are interested not only in process design but also in equipment design, because the equipment affects the overall process performance significantly and also has a potentially large impact on capital cost and therefore process economics. For example, shaft speed affects CSTR conversion and yield; overall chemical process performance, nozzle design, and dense mul† Disclaimer: Any opinions, findings, conclusions, or recommendations expressed herein are those of the author(s) and do not necessarily reflect the views of the Department of Energy (DOE). * To whom correspondence should be addressed. Fax: (412) 268-7139. E-mail: [email protected]. ‡ Carnegie Mellon University. § National Energy Technology Laboratory. | Ansys, Inc., Lebanon, NH. ⊥ Ansys, Inc., Morgantown, WV.

tiphase flow are critical for determining gasifier exit conditions and overall power plant performance; and multiphase flows are essential for spray dryers in food-processing applications. Moreover, striking an economic balance between mechanical equipment cost versus performance loss from poor fluid flow and heat and mass transfer is essential to explore at the design stage. For advanced process design, analysis, and optimization, we consider here the use of CFD models within a process/equipment “co-simulation”.1,2 Since 2000, co-simulation technology has been developed by the Department of Energy’s (DOE) National Energy Technology Laboratory (NETL) and its research collaborators.3,4 The team has developed the advanced process engineering co-simulator (APECS), a co-simulation software framework that integrates CFD-based equipment models with process simulators using the process industry CAPE-OPEN (CO) software standard.1 With APECS, one can integrate COcompliant equipment models based on CFD (e.g., FLUENT) (1) Zitney, S. E. CAPE-OPEN integration for CFD and process co-simulation. Proceedings of the American Institute of Chemical Engineers (AIChE) 2006 Annual Meeting, 3rd Annual U.S. CAPE-OPEN Meeting, San Francisco, CA, Nov 12-17, 2006. (2) Zitney, S. E.; Syamlal, M. Integrated process simulation and CFD for improved process engineering. Proceedings of the European Symposium on Computer Aided Process Engineering-12 (ESCAPE-12); Grievink, J., van Schijndel, J., Eds.; The Hague, The Netherlands, May 26-29, 2002; pp 397-402. (3) Sloan, D. G.; Fiveland, W. A.; Osawe, M. O.; Zitney, S. E.; Syamlal, M. Demonstration of coupled cycle and CFD simulations over a LAN. Clearwater Coal Conference, 2005. (4) Sloan, D.; Fiveland, W.; Zitney, S. E.; Osawe, M. O. Plant design: Integrating plant and equipment models. Power Mag. 2007, 151, 8.

10.1021/ef800984v CCC: $40.75  2009 American Chemical Society Published on Web 02/19/2009

1696 Energy & Fuels, Vol. 23, 2009

with CO-compliant process simulators, such as Aspen Plus, HYSYS, gPROMS, and COCO (www.cocosimulator.org). With co-simulation, the results of the process design or optimization are closer to industrial reality. For instance, Zitney and Syamlal2 presented the benefits of using APECS to optimize the shaft speed of a CSTR for a reaction-separation-recycle process, which could not have been accomplished using CFD or process simulation alone. It is evident that CFD models are very time-consuming and memory-intensive, especially for large, complicated equipment items. For instance, CFD simulations can require hours or days of CPU time for power plant equipment items, such as combustors, gasifiers, synthesis gas coolers, heat recovery steam generators, and fuel cells. When embedding CFD models within a recycle or optimization loop, the costs of the CPU time and memory may be prohibitive. One of the solutions for this limiting drawback is the application of a reduced order model (ROM) to approximate the rigorous CFD model. In this research, we propose an algorithm based on principal component analysis (PCA) and input/output mapping to target the specific needs of CFD-based ROMs. The PCA-based ROMs are data-driven and not equation-driven. They are developed using all of the information stored in the CFD solutions. With the developed ROM, we are able to obtain the state values in the flow field and the outputs from the equipment item corresponding to any specified input in the domain. To examine the effectiveness and efficiency of our proposed ROM methodology, we conducted two case studies involving the use of comprehensive CFD models for two power plant equipment items, namely a gas turbine combustor and an entrained-flow coal gasifier. In addition to PCA, a comprehensive set of computational techniques is invoked in these case studies, including experimental design [e.g., Latin hypercube sampling (LHS)], singular value decomposition (SVD), neural networks (NNET), and visualization of data scatter. In the combustor case study, a 2D CFD model with nine inputs is investigated. The results from 128 simulations of the combustor CFD model are evaluated to formulate the database, and an additional 45 CFD simulations are implemented for ROM evaluation. In the gasifier case study, a 3D CFD model with three inputs is studied. A total of 30 cases of the gasifier model are simulated, and the results of 28 cases among them are manipulated for PCA-based ROM development. Upon testing and evaluating the ROMs within a specified domain of input spaces, we conclude that the proposed PCA-based methodology for ROM is realizable and sufficiently accurate for use in APECS co-simulations and optimizations that require accurate CFD-based equipment models. In the next section, we will systematically describe the PCAbased ROM approach with a simple example of water flow in a manifold. Section 3 presents the combustor and gasifier case studies to support our proposed methodology. Finally, Section 4 concludes the paper and outlines directions for future work.

Lang et al.

new orthogonal coordinate system, such that the greatest variance by any projection of the data lies in the first coordinate, i.e., the first principal component (PC), the second greatest variance by any projection of the data lies in the second PC, etc. PCA is a common technique for reducing the number of variables and detecting structure in the relationships between variables in multidimensional data sets. This provides a transformation of the original data space spanned by the PCs, which are basis vectors. Using vector field data taken from the solution of a CFD model, we assemble these data to form a snapshot matrix X ∈ Rm×n. Any column Xi, i ) 1, n in matrix X can be decomposed as a linear combination of its PCs n

Xi )

∑ Ψ r and X ) ΨR j i

The columns of the matrix Ψ ∈ Rm×n are PCs, and the columns in the matrix r are scores that are the coordinates of the data elements in transformed vector space. The PCs are the eigenvectors of matrix XXT that are arranged according to their corresponding eigenvalues in descending order, λ1 > λ2 >... > λn. There are a few methods to calculate the PCs, such as SVD, Lanczos method, and Arnoldi method.6 Among them, SVD is the most direct. For instance, matrix X can be decomposed by SVD after eliminating zeros as X ) W ΣVT

(2)

where diagonal matrix Σ ) diag(s1, s2, ..., sn) and square matrix V ∈ Rn×n are the eigenvectors of XTX with the eigenvalues λj, j ) 1, n, which are related to the SVD as λj ) sj2, that is, the variance in the data set captured by the corresponding PC. Reducing eq 2 to rank r resulting from the cutoff criterion in eq 6, we express the reduced order data set as X ≈ Xˆ ) W(r)Σ(r)(V(r))T and Xˆ ) Ψr

(3)

where the superscript (r) indicates that the first r columns are taken from the original matrix to formulate a new matrix and the reduced score matrix is r ) Σ(r)(V(r))T ∈ Rr×n

(4)

Then, any column Xi in data set X can be expressed with reduced order space as Xi ∈ Rm ) Ψri

(5)

The dominant variances in the PC space that are captured by the reduced order data set can be determined from r

Rr )



r

∑s

2

∑λ ∑s

2

λj

j)1 n

j

)

j)1 n

j

2. PCA Strategy for ROM Development

(1)

j)1

j)1

e1

(6)

j

j)1

2.1. Fundamentals of PCA. PCA is probably the oldest and best known multivariable analysis technique. It was first introduced by Pearson in 1901 and developed thereafter by Hotelling in 1933.5 PCA is a powerful statistical tool to simplify a multidimensional data set to a lower dimensional analysis. PCA is a linear transformation that transforms the data to a

2.2. PCA-Based ROM Approach. To discuss the proposed methodology of the PCA-based ROM, we first give the notations and their definitions followed by the implementation of PCA on the given data set to obtain reduced PCs and scores. Then, we will discuss the mappings from input space into scores for state space and output space, respectively. Finally, we give the algorithm to summarize the methodology. It is worth noting

(5) Jolliffe, I. T. Principal Component Analysis; Springer-Verlag: New York, 2002.

(6) Golub, G. H.; Van Loan, C. F. Matrix Computations; John Hopkins University Press: Baltimore, MD, 1996.

ROM Based on PCA

Energy & Fuels, Vol. 23, 2009 1697

that there are three spaces introduced: input space, state space, and output space. 2.2.1. Vectors, Spaces, and State-Space Model. We first define in- and output spaces for a CFD equipment model. An input vector u in the input space contains the values of all variables for a given inlet boundary condition, i.e., u ) [Tin, Pin, f1, ..., fnci, ...]T, where Tin and Pin are the temperature and pressure of the streams entering the equipment item, respectively, and the elements fi, i ) 1, nci, are flow rates of the chemical species in the input stream. Furthermore, the input space can also contain CFD-related operating parameters, for example, shaft speed in a CSTR. Similarly, the output space is spanned by all variables in output streams leaving the equipment item. An output vector y in the output space contains the magnitudes of all variables for a given outlet boundary condition, i.e., y ) [Tout, Pout, d1, ..., dnco]T, where Tout and Pout are the temperature and pressure of the streams from the equipment item, respectively, di, i ) 1, nco, are flow rates of the species in the output stream, and nco is the number of chemical species in the stream. The output space can also accommodate calculated CFD model outputs, e.g., current for a fuel cell. To define the state space, we denote the state vector x in the state space, where each element of x represents a value of a state variable at one node of the mesh. Moreover, x ) [T, P, ux,...]T, where T, P, and ux are also column vectors representing temperature, pressure, and velocity components. Without loss of generality, we now denote x as a specific state variable, e.g., temperature, at all nodes of the mesh. We combine these vectors to form the state space model in the form of x˙ ) Λx + Πu y ) Γu

(7)

However, because our system is steady-state, then the state space model for steady-state process can be expressed in linear or nonlinear forms, respectively, as

[] [ ] [] [ ] x B x B(u) ) u or ) y D y D(u)

(8)

where the matrix B is a mapping that transforms u in the input space to x in the state space and the matrix D is a mapping that transfers u to y in the output space. Thus, we focus on identifying the two matrices B and D first, and then we can calculate the state x and output y using one of the forms of eq 8. Identifications of B and D are performed separately. A mapping can be implemented directly from u to D. On the other hand, we must decompose B into PCs and scores first and then implement mapping from u to the scores in PCA. 2.2.2. Order Reduction of State with PCA. Using the statespace definitions, we start with a set of data (SOD) of the vector field obtained from the CFD model for a series of n input cases. Each input case is characterized by a vector, Uj ∈ Rp defined in the input space, which contains all given information, such as boundary conditions, equipment parameters, operating conditions, and input information streams, e.g., defined from the Aspen Plus process simulator. We define an input matrix U ∈ Rp×n to store the set of input vectors. Values of U are bounded by domain Γ (in terms of an “operating window”), which defines the range of each variable in the input. Column Uj, j ) 1, p stores the values of all input variables for the jth CFD case. Extracting a set of snapshots of one specific state from the SOD, we can formulate a data set matrix, X ∈ Rm×n for the state. We assume that there are q states in the state vector field, m nodes in the mesh, and n responses for each state to the n

inputs. Therefore, each state matrix X ∈ Rm×n has elements xi,j, i ) 1, m that store the particular state value at the ith node corresponding to the jth input. We denote the columns Xj ∈ Rm, j ) 1, n as the “snapshot” for the jth input case. The total number of X matrices might be as large as q for all states in the model. We now apply SVD to each X. On the basis of the criterion in eq 6, we choose a rank r and represent reduced versions of X as X ≈ X(r) ) W(r)Σ(r)(V(r))T ) Ψr

(9)

where Ψ ) ∈ and r ) ∈ The matrix X is decomposed into its PC matrix Ψ ∈ Rm×r and score matrix r ∈ Rr×n. Considering that PCs and the scores are axes and coordinates, respectively, in the transformed system by PCA, we note that the PCs are unchanged for all inputs producing the SOD. Consequently, we can assume that this fact is also true for all of the inputs in the domain Γ. In other words, for any input Uj ∈ Rp restricted in the domain Γ, we need only find its score vector Rj ∈ Rr to calculate the response Xj ∈ Rm as linear combinations of the PCs, i.e. W(r)

Rm×r

Σ(r)(V(r))T

Rr×n.

Xj ) ΨRj

(10)

The only varying components in eq 10 are the score Rj obtained from PCA of the data set with potential relationships between Rj and input Uj. Additional functions between them can be built with techniques, such as regression, neural networks, etc., according to the complexity of the CFD model. Equivalently, this is the procedure that builds the mapping F(u):U f R(u) that allows us to calculate the state corresponding to any input instance, us, specified in the domain Γ, i.e.

[ ]

[ ][ ]

r F1 f v F2 f r Rs ) Rs(us) ) Fus ) l V r Fr f

v us V

(11)

where Fj, j ) 1, r is the row component of matrix F containing all coefficients of the input-to-score mapping. By substituting Rs into eq 10, we can obtain the state xs at all nodes of the mesh corresponding to input us as xs ) ΨRs ) ΨFus

(12)

Consequently, eq 12 is the expected ROM of the state, where PCs, Ψ(x) ∈ Rm×r and mapping F ∈ Rr×k are determined in advance, so that any given input, us, can be used to compute the corresponding state, xs directly. Thus, the development of the ROM for an equipment item is based on creating the mapping F:U f R from the CFD solutions, which leads to the mapping xROM:U f X for every state x. When eq 12 is compared to eq 8, it is obvious that the matrix B in eq 8 is decomposed into two matrices, i.e. B ) ΨF

(13)

Because the model is derived entirely from the SOD, there are no limitations because of geometric shape, field dimension, type of flow, material properties, chemical reactions, etc. in the CFD model. 2.2.3. Mapping with Neural Networks. As mentioned, PCAbased ROMs should have two functional capabilities: an input to calculate the variables in the state space with xROMs and an output space with yROM. Both of the spaces involve linear or nonlinear regression. In this research, we use the backpropagation neural network for complicated cases and linear regression for a simple example.

1698 Energy & Fuels, Vol. 23, 2009

Lang et al.

Figure 1. Scheme of a back-propagation neural network.

In 1957, Kolmogorov proved that a continuous function of n arguments can always be represented using a finite composition of unary functions and addition. In other words, Kolmogorov’s theorem states that any continuous function can be reproduced exactly by a finite network of computing units, whereby the necessary primitive functions for each node exist.7,8 Applied to neural networks, Demuth, Beale, and Hagan9 restate this property as “multiple-layer networks are quite powerful. For instance, a network of two layers, where the first layer contains sigmoids and the second layer is linear, can be trained to approximate any function (with a finite number of discontinuities) arbitrarily well”. To implement the mapping from the input to the ranked scores after PCA, we configure a two-layer neural network accordingly. The scheme for a typical back-propagation neural network with p inputs, s neurons in the hidden layer, and r targets in the output layer is shown in Figure 1. The active function in the hidden layer is a sigmoid, f, and purely linear in the output layer. With Ω ∈ Rr×n as the target and U ∈ Rp×n as the input, we train the neural network with either resilient back propagation or Bayesian regularization to get the following weighting parameters simultaneously: IW{1, 1} ∈ Rs×p b{1} ∈ Rs×1 LW{2, 1} ∈ Rs×r b{2} ∈ Rr×1

(14)

For this study, we use the tan sigmoid function given by f(x) ) tansig(ξ) ) 2/(1 + exp(-2 - ξ)) - 1

(15)

which maps the argument ξ ∈ R into the range -1 e f e 1. We therefore express the neural network mapping as

(∑

)

s

ak ) f

IWk,juj + b{1}k , k ) 1, s

j)1

(16)

s

Fi(u) )

∑ LW

i,kak + b{2}i,

i ) 1, r

(17)

k)1

where Fi(u), i ) 1, r is the mapping of the score for each rank i in eq 13. It is well-known that, even though a neural network can fit the data set well, overfitting or bad interpolation still remains a serious problem. We use training methods of resilient back propagation for xROM and Bayesian regularization for yROM. With the former, it is easy to converge to very low errors at (7) Sprecher, D. On the structure of continuous function of several variables. Trans. Am. Math. Soc. 1964, 115, 340–355. (8) Rojas, R. Neural NetworkssA Systematic Introduction; SpringerVerlag: Berlin, Germany, 1996. (9) Demuth, H.; Beale, M.; Hagan, M. Neural network toolbox. MATLAB, 2005.

designed points with manually adjusting the number of neurons in the hidden layer, so that the trained NNET poses a good regularization. With the latter, the method automatically trades off between the accuracies at the designed points and the regularization. The test and evaluation in the case studies indicate that the resulting xROM and yROM have a good approximation for both designed and unknown points. 2.2.4. PCA-Based ROM Algorithm. As a result, we can summarize the main steps of this methodology below: (1) Determine a domain, i.e., operating window, in the input space. (2) Implement an experimental design to obtain a set of n inputs within a predefined operating window for all p input variables. Formulate the inputs as a matrix U ∈ Rp×n as described above. (3) Solve the CFD cases one-by-one with the designed inputs under the defined mesh with the CFD simulator and formulate the SOD, including the output set Y. (4a) For xROM, execute a loop for each state, for j ) 1, q (q is the number of states to be considered). (1) Extract the jth state from the SOD. Formulate the data set X ∈ Rm×n for the jth state variable in the flow field, e.g., temperature, pressure, velocity, etc. (2) Implement SVD on X ) WΣVT. (3) According to required accuracy and the cutoff criterion in eq 6, select the reduced rank, r. (4) Obtain the ranked principal components Ψ ) W(r) and scores r ) Σ(r)(V(r))T. (5) Build the mapping U f R. (6) Formulate xjROM:u f X ) ΨR(u). (7) Next j. (4b) For yROM, with obtained Y and given U, implement the mapping U f Y and then derive yROM:u f y ) G(u). Eventually, we can augment the states and formulate the PCAbased ROM in the following state space model:

{ }

x1ROM l ROM ) xnsROM yROM

:

[ ] [ ] [ ][ ]

x B Ψ 0 F ) u) u (18) y D 0 I G

where F is defined in eq 12 and G results from the mapping of yROM. Similarly, for the nonlinear form of eq 13, we can express the ROM as

{ }

x1ROM l ROM ) xnsROM yROM

:

[ ] [ ] [ ][ ] x B(u) Ψ 0 F(u) ) ) y D(u) 0 I G(u)

(19)

2.2.5. IllustratiVe Example. To briefly illustrate the algorithm and the results of PCA-based ROM, we introduce a simple, 2D water flow in the manifold example as shown in Figure 2a. Here, there is an intersection view of a multiple-branch unit with a width of 100 mm. Water at 20 °C and 1 atm flows in at the top with a given velocity (input), and outlets are open at the ends of the two branches at a pressure of 1 atm. The state variables are indexed as follows: x1 is the velocity component in the x direction; x2 is the velocity component in the y direction; and x3 is the pressure. While the ROM methodology can, of course, be applied to an arbitrary geometry, we focus only on the shaded part of the manifold shown in Figure 2b, to simplify the illustration. In this example, inlet pressure is constant and the input space has only one dimension spanned by the input flow rate. We

ROM Based on PCA

Energy & Fuels, Vol. 23, 2009 1699

Figure 2. Geometry of the manifold: (a) overall geometry and (b) area considered for ROM derivation. Figure 4. Comparison of POD and PCA.

Figure 3. Comparison of ROM results with input |V| ) 0.4. The left column is from CFD, and the right column is from ROM. From top to bottom is X1, X2, and X3.

design the input variable range as 0.2 e |V| e 2.0 (m/s) and consider 10 cases for the ROM derivation. Each case i corresponds to input of |V| ) 0.2i, i ) 1, 10. The size of the example is a dimension of the input space p ) 1, number of cases n ) 10, number of mesh points in the flow field m ) 809, and number of states q ) 3. FLUENT was used to solve the CFD model of the example for the 10 cases to obtain the snapshot data set. PCA analysis shows that only two PCs are needed to capture 99% of the variances of the data set, and a linear input-output mapping is accurate enough to develop the ROM. Using the developed ROM in the form of eq 18, we compare the flow and pressure fields from the CFD model (left) with the ROM (right) in Figure 3 for a specified input. It is evident that the ROM works very well and closely approximates the high-fidelity CFD results. Moreover, the wall clock time for the CFD model (run in FLUENT) requires approximately 100 CPU seconds for one case of the manifold, while the ROM (evaluated in MATLAB) for the same case runs in less than a CPU second. 2.2.6. AdVantages of the Proposed PCA-Based Methodology oVer the Proper Orthogonal Decomposition (POD)

Method. In ROM research, there is also a popular method called POD, which is often applied in fluid dynamics.10-19 Both PCA and POD are synonyms of Karhunen-Loeve decomposition, and there are only a few conceptual differences between PCA and POD. However, there are some differences in the implementations as shown in Figure 4. First, POD is usually applied to dynamic models requiring the solution of partial differential algebraic equation (PDAE) systems involving spatial-temporal variables. On the other hand, the PCA-based approaches proposed in this paper are often used in time-independent or steady-state models, where the solution of the models depends upon spatial dimensions and an input space. This difference determines the different representation of the data set matrices. As mentioned in the previous section, the columns in the PCA data set are state responses to corresponding inputs, while the columns in the snapshot for POD are the values of the states in the spatial coordinates usually measured at sequential time points. Of course, the data reduction procedures of PCA and POD are similar, and the same methods can be used to solve corresponding eigenproblems. Nevertheless, it should be noted that PCA-based approaches can also be applied to dynamic models by considering time in the same manner as an additional spatial dimension. (10) Fahl, M. Trust-region methods for flow control based on reduced order modeling. Ph.D. Thesis, 2000. (11) Willcox, K. E. Reduced-order aerodynamic models for aeroelactic control of turbomachines. Ph.D. Thesis, Massachusetts Institute of Technology (MIT), Cambridge, MA, 2000. (12) Kragel, B. Streamline diffusion POD model in optimization. Ph.D. Thesis, 2005. (13) Cazemier, W. Proper orthogonal decomposition and low dimensional models for turbulent flow. Thesis, 1968. (14) Cizmas, P. G.; Palacios, A.; O’Brien, T.; Syamlal, M. Proper orthogonal decomposition of spatio-temporal patterns in fluidized beds. Chem. Eng. Sci. 2003, 58, 417–427. (15) Moehils, J.; Smith, T. R.; Holmes, P.; Faisst, H. Models for turbulent plane couette flow using the proper orthogonal decomposition. Phys. Fluid 2003, 14, 7. (16) Smith, T. R.; Moehlis, J.; Holmes, P. Low-dimensional model for turbulent plane couette flow in a minimal flow unit. J. Fluid Mech. 2005, 538, 71–110. (17) Kunisch, K.; Volkwein, S. Control of the Burgers equation by a reduced-order approach using proper orthogonal decomposition. J. Optimization Theory Appl. 1999, 102, 2. (18) Bizon, K.; Continillo, G.; Russo, L.; Smuła, J. On POD reduced models of tubular reactor with periodic regimes. Comput. Chem. Eng. 2008, 32, 1305–1315. (19) My-Ha, D.; Lim, K. M.; Khoo, B. C.; Willcox, K. Real-time optimization using proper orthogonal decomposition: Free surface shape prediction due to underwater bubble dynamics. Comput. Fluids 2007, 36, 499–512.

1700 Energy & Fuels, Vol. 23, 2009

Lang et al.

Figure 5. Two-dimensional geometry of a single axisymmetric combustor can and its mesh.

Second, the post-decomposition steps used to calculate the temporal coefficients or scores are quite different. In POD, we return to the original PDAE system to calculate time-dependent coefficients, Rj(t), using the decomposed form of the solution and Galerkin projection to convert the PDAE model into a set of ordinary differential equations (ODEs). The number of resulting ODEs is (ny × r), where ny is the number of the variables or states in the partial differential equation (PDE) and r is the rank of the POD modes. In the PCA-based approach, the PC coefficients are called scores that are determined only by the inputs used to construct the data set. It is often straightforward to find the mapping from the input to the scores. Third, in POD, the rank corresponds to a set of ODEs to be solved to obtain the coefficients Rj(t), j ) 1, r of the ROM in eq 2 and increasing the rank increases the complexity of the POD procedure significantly. Therefore, the variance captured by POD is limited, for example, usually e99.9%, which introduces more error in the ROM. However, in PCA-based ROM, a higher rank can be used without directly increasing complexity of the ROM derivation. Therefore, the variance captured or cutoff (Rr in eq 6) can be arbitrarily close to one. In this way, the error in this PCA step is usually negligible. Finally, validity of the POD modes is not guaranteed when the model inputs change, while the PCs still remain accurate when the input changes, as long as the data set is well-designed and well-sampled. We would like to point out that Rj(t), j ) 1, r in POD are related to the scores in PCA; therefore, they can be derived in a data-driven manner (e.g., with NNET) instead of equationdriven. In comparison to POD, PCA therefore has the following advantages because of the factors of data-driven models. These make PCA better suited to develop ROMs used in steady-state process simulation and optimization. (1) No Restriction of Geometry. Process equipment items often have complex geometries to meet demanding industrial applications. Therefore, the capability for a ROM approach to handle arbitrary equipment shapes is crucial. POD is usually limited to regular geometry. (2) AVoiding Complexity of the PDAE Model. The original PDAE model solved by the CFD simulator already involves many types of governing equations. In addition to the Navier-Stokes equations, there may be many additional PDEs and algebraic equations that account for multicomponent mass and energy balances, thermodynamic properties, and chemical reactions. Furthermore, additional complexity can arise from the dimensionality of 2D or 3D CFD models. Because the proposed methodology does not deal directly with PDAEs or ODEs, it avoids the complexities in equation-driven methods, such as POD. (3) Interpolation. Because the mapping and regularization from the input into the states and the outputs are introduced in the proposed method, it is possible that the xROM and yROM

estimate the variables well in the state and the output spaces, respectively, for a given point in the input space. This is generally difficult for POD. 3. Process Case Studies with CFD Models To illustrate the effectiveness of the PCA-based methodology, we consider two challenging engineering case studies taken from an advanced gasification-based power plant.20 The first is a 2D gas-turbine combustor, and the second is a 3D entrained-flow coal gasifier. 3.1. Gas-Turbine Combustor. In advanced gasificationbased power plants, gas-turbine combustors burn a mixture of synthesis gas (mainly carbon monoxide and hydrogen) and air to deliver high-temperature gases to the gas turbine for power generation. Such combustors involve complex turbulent and reacting flows requiring the use of high-fidelity CFD models to provide an accurate calculation of the inlet temperature for the gas turbine.20 3.1.1. CFD Model Description. The combustor model used here is based on a turbulent, lean-premixed, swirl-stabilized research combustor at NETL.21 The CFD model is scaled up to represent a single combustor can in a 250 MW, 16 combustor can gas turbine used for an advanced gasification-based power plant. As shown in Figure 5, the 2D axisymmetric combustor geometry has nearly 6000 computational cells, which are used to solve the CFD model in FLUENT.20 The flame is anchored by careful scaling and design of the combustor, so that the nozzle velocity of the NETL research combustor is matched. In the FLUENT model of the combustor, the finite rate/eddy dissipation model is selected.20 The reaction rate is defined by taking the minimum of the chemical reaction rate and the turbulent mixing rate. The overall reactions in the combustor are 2CO + O2 f 2CO2 2H2 + O2 f 2H2O CH4 + 2O2 f CO2 + 2H2O

(20)

To develop the ROM, we can vary the input variables (in terms of inlet stream information in Aspen Plus/APECS) that represent operating and boundary conditions in the CFD-based combustor model (run in FLUENT). These cases provide corresponding state fields and output stream(s). In this study, we consider only one state, temperature. Nevertheless, the methodology of (20) Zitney, S. E.; Osawe, M. O.; Collins, L.; Ferguson, E.; Sloan, D. G.; Fiveland, W. A.; Madsen, J. M. Advanced process co-simulation of the FutureGen power plant. Proceedings of the 31st International Technical Conference on Coal Utilization and Fuel Systems, Clearwater, FL, May 21-25, 2006. (21) Sidwell, T.; Casleton, K.; Straub, D.; Maloney, D.; Richards, G.; Strakey, P.; Ferguson, D.; Beer, S. Development and operation of a pressurized optically-accessible research combustor for simulation validation and fuel variability studies. Proceedings of ASME Turbo Expo, 2005; paper GT2005-68752.

ROM Based on PCA

Energy & Fuels, Vol. 23, 2009 1701

Figure 6. Typical temperature field: (a) entire geometry and (b) enlargement of the interesting portion of the field. Table 1. Modified Ranges of Independent Design Variables minimum _u2 _u3 fuel _u6 _u7 _u5 air _u6 _u1 phys _u8 _u9

molar fraction of CO2 molar fraction of CO molar fraction of H2O molar fraction of CH4 molar fraction of CO + H2 R2 ) O2/(0.5(CO + H2) + 2CH4 operation pressure (MPa) inlet mass flow rate (kg/s) inlet temperature (K)

0.01 0.02 0.15 0.02 0.5 0.98 1.4 15.0 300

base 0.026 0.035 0.211 0.037 0.565 1.85 20.5 659

maximum 0.05 0.04 0.25 0.05 0.65 5.50 2.2 25.0 1000

deriving the ROM for temperature can be applied directly to any other state without any restriction. Figure 6 shows a typical temperature field displayed as filled contours. From the entire temperature field (Figure 6a), we immediately find that the temperature changes significantly only near the inlet in the left portion of the geometry, while in the rest of the geometry, the temperature remains uniform. To display the results easily, we focus only on the rectangle shown in Figure 6b, excluding the nozzle part. With the database obtained from running the CFD model in the area of interest, we implement PCA using SVD in Matlab. From the data set, the ranked PCA can capture 99.999% of the energy of the 128 cases, with only 35 principal components to develop the ROM. 3.1.1. Experimental Design. Because the combustor model has multiple inputs, we need to minimize the number and size of the database used for development of the ROM and maximize the information covered by the database. Here, it is necessary to invoke an experimental design to obtain a set of inputs to cover the input domain and to keep the number of designed points to a minimum. There are several methods of experimental design available in literature. For example, response surface methodology was first developed by Box and Wilson.22 Further Box and Draper23 extended the method to create evolutionary operation. Differing from these, orthogonal arrays (often called the Taguchi method),24 LHS,25 and numeric theoretic methods26 provide a set of appropriate sample points before conducting experiments. In this study, we chose the LHS method with input domain listed in Table 1, along with their base values from Zitney et al.20 3.1.2. Build ROM. It is straightforward to perform a batch of CFD simulations for the designed input set. For each simulation, a case is established by extracting the temperature (22) Box, G. E. P.; Wilson, K. B. On the experimental attainment of optimum conditions. J. R. Stat. Soc., Ser. B 1951, 13, 1–45. (23) Box, G. E. P.; Draper, N. R. EVolutionary Operation; John Wiley: New York, 1969. (24) Hedayat, A. S.; Sloane, N. J. A.; Stufken, J. Orthogonal Arrays: Theory and Applications; Springer-Verlag: New York, 1999. (25) Patterson, H. D. The errors of lattice sampling. J. R. Stat. Soc., Ser. B 1954, 16, 140–149. (26) Hua, L. K.; Wang, Y. Applications of Number Theory to Numerical Analysis; Springer-Verlag: Berlin, Germany, 1981.

Figure 7. Comparisons for case 36 of the combustor example. Temperature contours are compared between the original CFD model, the temperature field reduced by PCA, and the ROM generated from the PCA field.

field and integrated (mass average weighted) outlet temperature. When all 128 cases are solved, we obtain a database matrix with m ) 5996 (number of mesh points) rows and n ) 128 columns; each column of the matrix corresponds to the temperature field of one case. Also, we obtain a vector where each entry represents the outlet temperature of the corresponding case. As we mentioned above, because most of the temperature field is uniform, we consider only the area of interest in Figure 6b for our investigation and development of the ROM. The size of this case is p ) 9, m ) 2601, n ) 128, q ) 1, and r ) 35. 3.1.3. EValuation of ROM. To evaluate the ROM thoroughly, two different comparisons are made. First, we test the predictive accuracy of the ROM at the LHS designed points. Second, we evaluate the approximation quality of the ROM at random points within the designed domain in the nine-dimensional input space. We now have three data sets obtained by CFD simulation, PCA approximation, and the derived ROM. Each set contains 128 cases with different operating and boundary conditions. We denote the resulting CFD temperature fields as true values of the combustor model, written as TFtrue. Similarly TFPCA represents the fields approximated by the ranked PCA, and TFROM represents the field calculated with the derived ROM. out and T out Similarly, T true ROM denote the output temperatures calculated by the CFD and ROM, respectively. We plot the three temperature fields in filled and unfilled contours using data in TFtrue(i), TFPCA(i), and TFROM(i), respectively, corresponding to case i. An example is shown in Figure 7. When we compare these plots one-by-one, we see that, in some of them, there is little observable distortion between TFtrue and TFPCA, even though most of them have no distinguishable differences. It is possible to compare the outlet temperatures obtained by CFD and ROM quantitatively. The magnitude of the differences is 10-8 to 10-9. Therefore, we have TFROM(i) ) TFPCA(i) ≈ TFtrue(i), i ) 1, 128 Because the error is so small, we can state that qualitatively out out TROM (j) ) Ttrue (j), j ) 1, 128

Summarizing the above, we are confident that the ROM is an accurate approximation of the CFD model at all of the 128 designed data points.

1702 Energy & Fuels, Vol. 23, 2009

Lang et al.

Figure 8. Entrained-flow gasifier and its typical temperature profile (K).

We also test the ROM at a set of 45 random points in the input space u ∈ Rp, p ) 9. Each test point is a vector Vc in the space u, which is a linear combination of a pair of vectors V1 and V2 in u, which are randomly selected among 128 designed vectors in the space u Vc ) wV1 + (1 - w)V2

(21)

where 0 < w < 1 is a random weight. Using the 45 produced vectors, Vc values, we obtain calculated responses of the temperature field and the outlet temperatures using the derived ROM and FLUENT, respectively. When the results are compared by plotting them with contours, we find that the temperature fields are approximated with less error when the random point is in the proximity of a LHS design point. For a few test points that are far away from the existing points, the approximation of the field is significantly different. Nevertheless, the errors in the combustor outlet temperatures are less than 3.0% for all 45 test points; these are satisfactory for most of the cases and are important for iterative process co-simulation and optimization, because only the ROM output is used to obtain the output streams required in APECS. In Section 4, we will discuss future work to obtain better approximations of the state field with ROMs. In this case study, a significant CPU time saving is observed with reasonable accuracy when comparing the PCA-based ROM to the rigorous CFD model. The average CPU time for the CFD model (run in FLUENT) is approximately 2000 CPU seconds, while for the ROM, it is less than a CPU second. We also conclude that, by using neural networks as the nonlinear regression tool, there is no doubt that the ROM is well-matched at the designed points. On the other hand, it is important to overcome the overfitting during training of the neural network and to reduce interpolation errors at unknown points. We consider this issue in the next case study. 3.2. Entrained-Flow Coal Gasifier. Gasification technology is a key component of today’s integrated gasification combined cycle (IGCC) power plants and is expected to be the centerpiece of tomorrow’s high-efficiency, zero-emission systems. The gasifier provides a means for converting coal to a hydrogenrich synthesis gas, ideally suited for power generation, refining, and chemical applications. To realize the full potential of this promising new technology, researchers are using CFD modeling to better understand the complex physical and chemical phenomena, including fluid flow, heat and mass transfer, and chemical reactions, that impact gasifier performance and efficiency. These high-fidelity CFD models are also used within overall process simulations for improving the design, analysis, and optimization of gasification-based power plants.

Figure 9. Central section of the finite element mesh for the gasifier and area of interest for the vector field. Table 2. Input Domain of the Gasifier u1 u2 u3

lower bound

Shi’s base value

upper bound

70 25 0.75

78 30 0.817

85 35 0.85

The entrained-flow, coal-slurry gasifier considered here is a two-stage, upflow gasifier consisting of a horizontal first stage and a vertical second stage, as shown in Figure 8. All of the oxidant and 78% of the coal slurry are evenly divided between the left- and right-hand inlets of the first stage. This horizontal stage is mainly a coal combustor and provides hot gases through the connection to the second stage, in which the remaining 22% of the coal slurry is injected. Most of the coal gasification process occurs in the second stage. The total volume of the gasifier is 45.5 m3. The particle volume fraction is estimated to be around 4%, and the average particle residence time is estimated to be 10 s. The operating pressure is 28 atm. The coal slurry and the oxygen are fed into the gasifier at temperatures of 450 K and 411.4 K, respectively. It is important to note here that this is a prototype gasifier design, which is not intended to represent any existing gasifier designs, commercial or otherwise. 3.2.1. Model Description. Gasifier modeling has received a lot of focus over the past 30 years, primarily because of its complexity and the potential of gasification as a source of energy

ROM Based on PCA

Energy & Fuels, Vol. 23, 2009 1703

Figure 10. Contour plots of seven variables for case 29. The left figure in each set is the FLUENT result, while the right figure is the ROM prediction. Table 3. Summary of ARE and maxRE for xROM and yROM of Seven Variables in All 28 Cases index state (xROM)

1 H2S

2 H2O

3 H2

4 CH4

5 CO2

6 CO

7 temperature

ARE (×10-5) maxRE (×10-4)

2.162 1.703

3.412 3.579

2.442 2.609

2.553 2.444

2.330 3.622

1.158 1.346

2.760 4.323

2.402

output (yROM)

H2S

H2 O

H2

CH4

CO2

CO

temperature

ARE (×10-9) maxRE (×10-9)

1.144 3.061

2.617 9.801

2.318 6.143

4.014 9.259

10.567 26.821

1.103 2.922

0.699 2.181

Table 4. Statistical Performance of yROM Interpolations index

1

2

3

4

5

6

7

output

H2 S

H2 O

H2

CH4

CO2

CO

temperature

relative error (%)