Sequential-Optimization-Based Framework for Robust Modeling and

Nov 9, 2017 - Using matix calculus, we get (25) (26) (27) (28)Equations 25–28 are solved for each experimental condition in the set C. Further, it s...
0 downloads 8 Views 3MB Size
This is an open access article published under an ACS AuthorChoice License, which permits copying and redistribution of the article or any adaptations for non-commercial purposes.

Article Cite This: J. Phys. Chem. C XXXX, XXX, XXX-XXX

pubs.acs.org/JPCC

Sequential-Optimization-Based Framework for Robust Modeling and Design of Heterogeneous Catalytic Systems Srinivas Rangarajan, Christos T. Maravelias,* and Manos Mavrikakis* Department of Chemical & Biological Engineering, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States S Supporting Information *

ABSTRACT: We present a general optimization-based framework for (i) ab initio and experimental data driven mechanistic modeling and (ii) optimal catalyst design of heterogeneous catalytic systems. Both cases are formulated as a nonlinear optimization problem that is subject to a mean-field microkinetic model and thermodynamic consistency requirements as constraints, for which we seek sparse solutions through a ridge (L2 regularization) penalty. The solution procedure involves an iterative sequence of forward simulation of the differential algebraic equations pertaining to the microkinetic model using a numerical tool capable of handling stiff systems, sensitivity calculations using linear algebra, and gradient-based nonlinear optimization. A multistart approach is used to explore the solution space, and a hierarchical clustering procedure is implemented for statistically classifying potentially competing solutions. An example of methanol synthesis through hydrogenation of CO and CO2 on a Cu-based catalyst is used to illustrate the framework. The framework is fast, is robust, and can be used to comprehensively explore the model solution and design space of any heterogeneous catalytic system.



INTRODUCTION Computational analysis is a key step in mechanism elucidation of a catalytic system and subsequent design of improved catalysts.1−3 In this context, it typically includes (a) evaluating the energetics of the species and transition states of elementary reaction steps using a computational chemistry method such as density functional theory (DFT) or simple linear correlations derived thereof once a potential reaction mechanism is identified and (b) relating these microscopic properties to macroscopic quantities such as reaction turnover frequency (TOF), selectivity, etc., through mean-field microkinetic models or kinetic Monte Carlo simulations.4−12 A microkinetic model based on kinetic and thermochemical quantities derived from DFT, however, may not match with experimental data due to (1) inaccuracies in the estimation of the energetics using DFT, (2) inadequate catalyst models used in DFT studies which are not representative of the surface environment (e.g., clean vs covered by reactants, products, or spectators), or (3) incorrect active sites being considered. The first issue has been recently addressed by estimating the errors in a DFT calculation and propagating them through with a microkinetic model to assess prediction errors.13−15 Therefore, the model can, in principle, predict reaction rates, selectivity, orders, etc., and quantify the reliability of such predictions. Incorporating experimental data, on the other hand, allows for developing a more accurate model, albeit postdictively. This requires parameter estimation to reconcile model-experiment mismatches and identify a microkinetic model solution that is consistent with DFT model assumptions and calculations, © XXXX American Chemical Society

thereby elucidating the dominant reaction pathway, ratedetermining steps, active sites, etc.; in this context, it should be noted that this estimation step can specifically identify the required deviations from DFT-derived energetics, allowing for a direct comparison of these values with statistical errors of DFT. A workflow to obtain such self-consistent solutions was described by Singh et al.16 Further, identifying the “ideal” catalyst properties in terms of the thermochemistry and kinetic parameters of the catalytic reaction system that maximize the desired observables such as rate, selectivity, conversion, etc., provides a target for computational design of improved catalysts. In this case too, the solutions can be inferred as required deviations from DFT-derived energetics for a reference catalyst to identify which parameters to improve, and by how much. Parameter estimation and catalyst design are both nonlinear optimization problems17−24 in that an objective (linear or nonlinear) needs to be minimized or maximized subject to (non)linear constraints that capture the catalytic system (i.e., the underlying microkinetic model). A traditional approach to solving parameter estimation problems is the so-called sequential approach,25−27 wherein the optimizer obtains the information it needs regarding the reaction rate, surface coverages, and corresponding gradients by seeking a forward simulation of the microkinetic model at each step of the optimization process. Challenges with such an approach typically revolve around Received: August 13, 2017 Revised: October 8, 2017

A

DOI: 10.1021/acs.jpcc.7b08089 J. Phys. Chem. C XXXX, XXX, XXX−XXX

Article

The Journal of Physical Chemistry C dθi = dt

issues related to the stiffness of the microkinetic model and the ability of the optimizer to treat such highly nonlinear systems. Consequently, there has been significant interest in developing fast and robust methods to solve these classes of problems to comprehensively explore the solution space. Previously, Rubert-Nason et al.28 adapted an alternative method, referred to as the “simultaneous approach”,25,29−33 to show that parameter estimation for heterogeneous catalytic systems can be reformulated as a nonlinear programming problem (NLP), whereby the microkinetic model is recast as a system of nonlinear algebraic equations under steady state conditions. Such a formulation can also be used in the context of catalyst design. This reformulation offers a means for fast and comprehensive evaluation of the solution space of large optimization problems. However, several challenges arise within this approach. First, nonlinear algebraic equations can have multiple solutions and the NLP solver could potentially identify steady state solutions unattainable from physically realizable initial conditions; successful identification of a plausible solution is then numerically predicated on the initial guess provided to the solver. Further, when time variation of the microkinetic model is sought, the NLP reformulation has to be extended to account for the dynamics through methods such as orthogonal collocation; in such cases, if the microkinetic model is stiff, small integration step sizes are required, thereby increasing the number of variables in the resultant optimization problem. In contrast, the sequential approach offers advantages such as ease of implementation, a time evolution of the states of the system from initial conditions, thereby leading to physically realizable steady state solutions, and a generally smaller overall optimization problem in terms of the number of variables involved. Further, the sequential approach can be made robust and fast upon using state-of-the-art optimizers and numerical solvers, thereby partially mitigating the drawbacks. In this paper, we present a generic sequential optimization framework for solving parameter estimation and catalyst design problems in heterogeneous catalysis. The novelty of the proposed framework is that it allows for (a) comprehensively exploring the plausible set of deviations from DFT-derived energetics that explains the experimental observations in a robust manner, (b) comparing and contrasting these solutions through automated data analysis techniques that allow for hypothesizing competing dominant chemistries, (c) proposing target energetics to maximize the desired performance of a catalyst, and (d) specifically identifying sparse solutions that prevent overfitting by picking the smallest number of parameters that need to be modified. The paper begins with a description of the framework and continues with illustrative results in parameter estimation and catalyst design using CO and CO2 hydrogenation to methanol as an example.

1=

j∈J

∑ θidi + θ 0

rj = k for, j

(3)

∏ i ∈ I jRG

[Pi ]|νij|

ij

0, + j |

∏ [θi]|ν | [θ 0]|ν ij

i ∈ I jRS

0, − j |

∏ [θi]|ν | [θ 0]|ν

− k rev, j

∏ i ∈ I jPG

[Pi ]|νij|

∀j ∈ J

i ∈ I jPS

Pi = Ptot

(4)

Fi ∑G Fl

∀ i ∈ IG (5)

where i and j are the subscripts over species I and reactions J in the network, respectively; Fi is the flow of gaseous species; θi is the surface coverage; di is the denticity, or the number of sites occupied by the intermediate; θ0 is the free site coverage; kfor,j and krev,j are forward and reverse reaction rate constants of a given reaction (whose expressions are defined later); νij is the stoichiometric coefficient of gaseous/surface species i in the reaction j; νj0, + / − are the stoichiometric coefficients of the free sites involved in the reactants/products side of the reaction, respectively; IG and IS are subsets comprised of species in the gas phase and surface intermediates, respectively; I jRG , I JRS, I jPG , and I JPS respectively refer to the set of gaseous reactants, surface intermediate reactants, gaseous products, and surface intermediate products in reaction j; Pi is the partial pressure of gaseous species i; and Ptot is the total reactor pressure. The single site model written above can be extended to a multisite model in a straightforward manner. More generally, the dynamic CSTR can be written as dx = f (x , k , u) dt

(6)

g (x , k , u) = 0

(7)

y = Bx(t → ∞);

x(0) = x0(u);

u = {Fin , Ptot , T } (8)

where x is the set of flow rates and surface coverages, u is a vector of inputs and conditions (inlet flow rates, reaction pressure Ptot, and temperature T), y is the linearly transformed steady state solution of x, whose transformation is typically a (usually nonsquare) matrix B of zeroes and ones (to pick a subset of the components of x), and k is the vector of the kinetic rate constants (forward and reverse) and is a function of u (specifically of temperature, as discussed later). This set of differential algebraic equations can be solved using a linear implicit multistep backward difference34−36 method using an analytical (or numerically derived) Jacobian. Optimization. The optimization problemparameter estimation or catalyst designcan be formulated as below

OPTIMIZATION FRAMEWORK: METHODS Microkinetic Model. A continuous stirred tank reactor (CSTR) model is a reasonable choice of catalytic reactors operating at low per pass conversion ( 0

ΔHj =

k for, j = Kjk rev, j

̂

(11)

Kj = e−ΔHj / RT eΔSj / R = K 0je−ΔHj / RT

∀j ∈ J

DFT HTS, j = HTS, j + ΔE TS, j

(18)

∀i ∈ I

(19)

(20a)

∀j ∈ J

(20b)

where HTS,j is the temperature-dependent enthalpy of the transition state of a reaction j, Ea,DFT is the activation barrier j DFT calculated using DFT, HTS, j is the DFT-calculated enthalpy of the transition state, ΔEa,j is the deviation of activation barrier from the DFT value, and ΔETS,j is the deviation of the energy of the transition state. The activation barrier of a step is equal to the difference in the enthalpy of the transition state and the sum of enthalpy of the reactants. The set of constants p is comprised of k0, K0 (temperature independent terms), the DFT Shomate parameters of HDFT, and EaDFT (or HTS ). The decision variables, π, of the parameter estimation problem then are comprised of ΔEa (ΔETS) and ΔBE. For the catalyst design problem, on the other hand, the decision variables can be the deviations from DFT energies, or the energies obtained from the parameter estimation problem. In the latter case, for instance, the enthalpy of each species (19) would be written as

(13)

∀j ∈ J

∑ νijHi

Ea, j = Ea,DFT + ΔEa, j j

(12)

∀j ∈ J

(17)

where Hi is the enthalpy of the species i, HiDFT is the enthalpy calculated using DFT (or any computational chemistry method), ΔBEi is the binding energy differential from the DFT value, and Ij is the set of species in reaction j. It should be noted that the enthalpy is dependent on temperature and pressure; hence, it is dependent on the experimental condition c. The temperature dependence can be obtained from computational chemistry methods in the form of polynomial expressions (known as Shomate equations, the detailed derivation of which is given in Grabow and Mavrikakis37). The activation barrier of a reaction step can be written as

i* ∈ I G ⊆ I G

∀j ∈ J

(16)

∀ n ∈ {1, 2, ..., N }

Hi = HiDFT + ΔBEi

The experimental data is represented by y.̅ The objective to be maximized or minimized is subject to constraints corresponding to the microkinetic model (eqs 9a−c) and additional constraints (eqs 9d and e), wherein p is the set of constants and π is the set of decision variables (changed during optimization), and h is a set of inequalities that essentially ensures thermodynamic feasibility and bounds on the decision variables. It should be noted that the terms “parameters” and “decision variables” have been used interchangeably here; while “decision variables” is the technically correct terminology, “parameters” is expected to be better recognized in the catalysis community. The kinetics of the reaction system (the function ψ) can be written as k for, j = k 0je−Ea,j / RT

∀j ∈ J

i ∈ Ij

(10)



(15)

where πLB and πUB are the upper and lower bounds on the decision variables, respectively (for instance, they could specify the allowed range for the deviations in binding energy of all species). Additionally, a constraint that the binding energy of any species is always negative can be specified; however, this constraint is not a strict requirement. It should be noted that the heat of reaction can be written in terms of the binding energy of the species as

as constraints which must be satisfied for each experimental condition c ∈ C, the set of all experimental conditions. Here, x, y, and u are dependent on the experimental condition c and hence are referenced as xc, yc, and uc, respectively. In principle, the catalyst design objective could include multiple conditions, and therefore, we have included a subscript c and a summation. We drop subscript “c” in several places in the following analysis for brevity; in these cases, the mathematical treatment extends to all experimental conditions. Further, from eqs 9a−e, all terms, except for constants, are dependent on the experimental conditions. ΖParam.Est is the weighted least-squares error representing the model−experiment mismatch, ZCat.Des is the objective to maximize turnover frequency (thereby outflow yi of a product), the conversion of a reactant χi ′, or the selectivity of ̂ a product σi * within a specific subset of gaseous species (IG). Mathematically, conversion and selectivity are respectively defined as χi ′ =

∀j ∈ J

(14)

k0j is possibly a temperature-dependent factor and can be one of the following: (a) Arrhenius-type constant pre-exponential factor, (b) collision frequency that is inversely proportional to C

DOI: 10.1021/acs.jpcc.7b08089 J. Phys. Chem. C XXXX, XXX, XXX−XXX

Article

The Journal of Physical Chemistry C

where ϒmn is the (m, n)th element of the matrix. This can be obtained by differentiating the microkinetic model (6−8) with respect to π. Using matix calculus, we get

Hi = Hiref + ΔBEi , where Hiref is the enthalpy from the primary solution which will be used now as the reference, or nominal, value. The number of parameters can be large; therefore, overfitting the model to the data is of concern. To overcome this, we seek sparse solutions; that is, we seek solutions that require as few parameters to be nonzero as possible. From a physical chemistry standpoint, we seek solutions that minimally deviate from DFT-derived energetics. A computationally efficient way of identifying sparse solutions is through adding regularization terms, which are directly dependent on the magnitude of the parameter values (first/second norm, or a combination), as a penalty to the objective.38−40 Insignificant parameters will be forced by the optimizer to be zero, and only a subset of parameters will take nonzero values. While such regularization terms are added often in linear regression problems, herein, we adopt the strategy for our nonlinear parameter estimation objective by including a penalty for changing decision variables from their nominal values πnom. This term, typically referred to as a ridge or L2 regularization penalty, is given by the following summation

:=

(21)

⎛ dZ ⎞T N = ⎜ Param.Est ⎟ ⎝ ⎠ dπ ⎛ dyc ⎞T ⎟ W (yc − yc ) + 2δ(π − πnom) ⎝ dπ ⎠

∑ 2⎜

or T ⎛ dZCat.Des ⎞T ⎛ dyi ⎞ ⎜ ⎟ =⎜ ⎟ N= ⎝ dπ ⎠ ⎝ dπ ⎠

or

⎛ dσi * ⎞T ⎜ ⎟ ⎝ dπ ⎠

or

(26)

⎛ dg ⎞T ⎛ dg ⎞T 0=⎜ ⎟ :+⎜ ⎟ ⎝ dx ⎠ ⎝ dπ ⎠

(27)

⎛ dy ⎞T ⎜ ⎟ = B:(t → ∞) ⎝ dπ ⎠

(28)

d: dt

→ 0, t → ∞; the sensitivities

dy T dπ

( )

can therefore

be calculated by solving the set of linear algebraic eqs 25 and 26 for : . This simplification speeds up the evaluation of the sensitivity values relative to the standard method of augmenting eqs 6−8 with eqs 26 and 27 and solving the complete system together.41−44 For the catalyst design problem, the Hessian can either be calculated on the basis of the second derivative or by using the Broyden Fletcher Goldfarb Shanno (BFGS) Hessian updates.45 Insensitive decision variables render the gradient and Hessian ill-conditioned such that the eigenspectrum of the latter varies over several orders of magnitude;46,47 this results in solutions that may contain arbitrarily large values for these variables. Mathematically, the penalty terms in eq 21 introduce a boundedness requirement on the decision variables and condition the Hessian to reduce the span of this eigenspectrum. Sequential Optimization Procedure. Figure 1 shows the three modules of the scheme and the associated information flow adopted in this work to solve optimization problems with an embedded differential algebraic system of equations, typically referred to as the sequential approach.25 In each gradientbased iteration step of optimization, the differential algebraic set of equations is solved and sensitivities (and, thereby, the gradient and the Hessian) calculated in the DAE solver and

N = |π |

where |π| is the size of the vector π and δ is a constant prechosen to modulate the contribution of the penalty terms to the overall objective. It should be noted that πnom is set to zero if the parameters represent deviations from DFT derived values. In principle, the catalyst design objective could also include penalty terms. Gradient Calculation through Evaluation of Sensitivities. A gradient based method to solve nonlinear optimization problems requires the gradient and the Hessian of the objective. The gradient vector N is the first derivative of the objective with respect to the decision variables and can be written as

c∈C

⎛ df ⎞T ⎛ df ⎞T d: ⎜ ⎟ :+⎜ ⎟ = ⎝ dx ⎠ ⎝ dπ ⎠ dt

CSTR,

n=1

=

(25)

Equations 25−28 are solved for each experimental condition in the set C. Further, it should be noted that, for a steady state

N

δ(π − πnom)T (π − πnom) = δ ∑ (πn − πnom, n)2 ;

⎛ dx ⎞T ⎜ ⎟ ⎝ dπ ⎠

⎛ dχi ′ ⎞T ⎜ ⎟ ⎝ dπ ⎠ (22)

For small deviations from the optimal point, the matrix of second derivatives of the objective with respect to the decision variables, that is, the Hessian / , for a parameter estimation problem can be approximated as6 /≈

∑ c∈C

⎛ dy ⎞T W ⎜ c ⎟ + 2δ dπ ⎝ dπ ⎠ dyc

Figure 1. Schematic of the information flow in the sequential method for solving nonlinear optimization problems with an embedded system of DAEs. The flow of decision variables π, objective Z, state profiles y,

(23)

The gradients and Hessians, in turn, require the calculation of sensitivities matically,

dy dπ

dy T dπ

(subscript “c” dropped for convenience). Mathe-

x(t), and sensitivities

have been marked explicitly. Inputs to the

optimization block are constants p such as the DFT-derived kinetics and thermochemistry, initial guesses of decision variables π0, reactor inputs u such as the temperature, pressure, and feed composition, and experimental data y,̅ while the outputs are the optimal values of the decision variables π and model predictions y.

is a matrix defined, on the basis of matrix calculus, as

⎡ dy ⎤ dy = [ϒmn] = ⎢ m ⎥ dπ ⎣ dπn ⎦

dy T dπ

( )

(24) D

DOI: 10.1021/acs.jpcc.7b08089 J. Phys. Chem. C XXXX, XXX, XXX−XXX

Article

The Journal of Physical Chemistry C

Figure 2. Workflow of the multistart method for identifying alternative solutions for the nonlinear optimization problem.The Latin hypercube sampling block spawns “L” different initial values of decision variables that are input into L different optimization runs.

clustering analysis can be carried out in a hierarchical manner to indicate which solutions are more similar to each other than to others. Hierarchical clustering strategy is commonly used to classify and organize large amounts of data, and dendrograms offer a natural representation of hierarchical clustering.58 Second, the similarities and differences between solutions can further be analyzed (by visual inspection) by using a heat map wherein an array of solutions (rows) and parameter values (columns) is generated and color coded according to the magnitude of the latter. This allows for the user to (a) compare any two solutions, for similar solutions would have an identical column of colors, (b) identify parameters having the largest (smallest) magnitude of deviation from DFT in a given solution, and (c) assess the variation of parameter values across different solutions, for a homogeneously (heterogeneously) colored row of the array would signify that the parameter is modified to the same (different) extent. Software Implementation Details. The sequential optimization and multistart workflow has been implemented in C++ using state-of-the-art software. Specifically, IPOPT45 is used for optimization (using the default MUMPS linear algebra solver), SUNDIALS for DAE solution,59 EIGEN60 for linear algebraic evaluations of the sensitivities, open source codes for Latin hypercube sampling,61 and the statistical analysis software R62 for clustering analysis and heat map generation. The L optimization runs are carried out in parallel because they are essentially independent of each other. A generic input scheme (shown in the Supporting Information) has also been developed for parsing reaction system information unambiguously.

sensitivity analysis modules, respectively, using the values of the decision variables calculated by the optimizer at that step. The objective value and the sensitivities at that step are then supplied back to the optimizer. This loop terminates when the optimizer reaches a desired tolerance. Multistart Optimization. Nonlinear optimization problems can have multiple local minima; to explore all alternative solutions, ideally, all local minima in the parameter space need to be identified. Several methods48 have been proposed to this end, including hybrid/scatter searches49−51 and global optimization.52−55 We adopt a multistart approach whereby several initial values are chosen in the parameter space from which independent optimization runs are carried out. Although this does not rigorously ensure that all minima have been found, a multistart approach offers a systematic means to explore the parameter space comprehensively. Figure 2 shows the workflow of our proposed strategy. The set of information needed reaction mechanism, ab initio energetics information (from which constants, such as DFT-derived nominal values of kinetics and thermochemistry, can be derived), initial and operating conditions (or inputs u), and experimental data (to obtain y) ̅ is to be specified initially as inputs. Using this information, a number of optimization runs are spawned with different initial guesses π0 for the decision variables (π0,1−L); to scan the entire parameter space uniformly, a Latin hypercube (LHC) sampling procedure is used that has been shown to be statistically more efficient for selecting values of input variables than random sampling.56,57 Ideally, parallel optimization runs can be spawned with these “L” sampled points as initial guesses; however, it is likely that these points do not satisfy the thermodynamic consistency checks. To rectify this, we solve an optimization problem (not explicitly shown in the figure) where the objective is to move the sampled point minimally so that all original inequalities are satisfied. Mathematically, the objective of this new optimization preprocessing step is given by



METHANOL SYNTHESIS CO and CO2 can be hydrogenated to methanol on Cu/ZnO/ Al2O3 heterogeneous catalysts,63,64 which represents the stateof-the-art industrial methanol production technology. Methanol production is one of the most widely used technologies in the chemical, fuel, and energy industries, with an annual global footprint of more than 50 million metric tons of production, $50 billion in revenue, and 7.5 × 1017 J of energy consumption.65,66 This system has been extensively explored experimentally and computationally using DFT.67−82 Previously, Grabow and Mavrikakis37 presented a 49-step reaction mechanism of this system and performed detailed DFT calculations on Cu(111) to obtain species thermochemistry and reaction kinetics. They further showed that, to fit a microkinetic model to experimental kinetics data,67 the energetics obtained from DFT on Cu(111) had to be significantly altered (by more than 50 kJ/mol for the binding of several species and activation of reaction steps), indicating that a defect site such as the step edges on the catalyst particle on a Cu facet more open than Cu(111) could be the location of the dominant active site. Subsequently, Rubert-Nason et al.28

Ζ LHC = min(π − πLHC)T (π − πLHC) =

∑ (πn − πLHC, n)2 n∈N

(29)

subject to eqs 9d and 9e as constraints. The solution to this problem is then taken as the input to the main optimization problem. Further, it is likely that many of the starting points reach the same local optimum. The solutions to these L optimization runs are, therefore, gathered to identify similarities and differences between the solutions in a postprocessing step. We adopt two approaches to this end. First, given a set of solutions, it is possible to group together, or cluster, these solutions depending on how close (or similar) they are on the basis of a chosen metric. Specifically, such a E

DOI: 10.1021/acs.jpcc.7b08089 J. Phys. Chem. C XXXX, XXX, XXX−XXX

Article

The Journal of Physical Chemistry C

obtained by Rubert-Nason et al.28 (0.34) and Grabow and Mavrikakis37 (0.7) for the same data set. An important statistical quality of a fit is that it is unbiased; that is, the model does not consistently over(under-)predict any subset of the experimental data. We therefore performed the Anderson−Darling test83−85 to check if the errors (mismatch between model predictions and experimental data) follow a normal distribution. The details are given in the Supporting Information; our model is unbiased with statistical significance for methanol turnover frequency. A similar analysis for water turnover frequency showed deviation from normality; however, this was found to be primarily caused by a few points with large model−experimental mismatches. This solution, similar to that presented by Grabow and Mavrikakis,37 proposes significant additional binding of the surface intermediates (by up to 60 kJ/mol) and reduction of activation barriers of certain reaction steps (by up to 80 kJ/mol) relative to DFT predictions derived for Cu(111). This large extent of deviation suggests that the discrepancy does not arise from errors in DFT alone and that the active site is different from Cu(111); the nature of the deviations (stronger binding of the surface intermediates relative to DFT-determined binding energies) suggests that the active site is likely undercoordinated compared to Cu(111) sites. This is consistent with recent studies that suggest that zinc-decorated step edges of Cu(211) could be the active site.80 Case studies were undertaken to ensure that the total coverage of intermediates remained less than 0.25 monolayer (which represents twice the minimum coverage considered in the 3 × 3 slab used in the DFT calculations by Grabow and Mavrikakis37); however, these always led to worse solutions in terms of the final objective (typically by more than a factor of 2); discussing these studies is outside the scope of this work. Figure 4 shows the reaction network under representative industrial conditions of 75 atm and 528 K for a feed CO:CO2:H2 molar ratio of 0.1:0.1:0.8 with a total carbon inlet flow rate of 1 (mol C) (mol sites)−1 s−1 as predicted with parameters determined for the primary solution. The predicted methanol turnover frequency is 0.05 s−1. The width of the arrows is a qualitative indicator of the magnitude of the reaction flux. The coverages of surface intermediate species are color coded. Table 1 contains the degree of rate control86−88 (XRC) of methanol synthesis for the most rate-controlling steps and their flux values, while Table 2 gives the coverages of abundant surface intermediates. Mathematically,

adopted a smaller reaction network based on Grabow and Mavrikakis37 to showcase the method of nonlinear program (NLP) formulation of the parameter estimation problem and found solutions that had much lower overall error between microkinetic model and experiments. We use the full reaction network here as an illustrative example to showcase the performance of our method for parameter estimation and catalyst design. We refer the reader to the work of Grabow and Mavrikakis37 for the complete reaction network.



RESULTS: PARAMETER ESTIMATION The reaction mechanism and computational data from Grabow and Mavrikakis37 and experimental data from Graaf et al.67 are utilized for this study. The decision variables (or parameters) were deviations of the transition state energy of surface reactions and binding energy of surface intermediates from DFT values, viz., ΔETS and ΔBE, respectively. Five hundred Latin hypercube samples were tried with deviations from DFT values within ±100 kJ/mol (±1 eV). The objective was the sum of squares residuals (SSR) of the turnover frequency of methanol and water production; consequently, the weight matrix W is a diagonal matrix with all entries zero except those corresponding to water and methanol indices which are set to 1. Further, the penalty for the parameters δ was set at 10−4, the relative and absolute tolerances for integration were fixed to 10−6 and 10−8, respectively (the absolute tolerances were sometimes relaxed for stiff cases; however, tighter tolerance checks at the final point confirmed convergence), the optimization tolerance was set to 0.01 (to avoid noise from integration errors), and the nominal values of the parameters were set to zero. We first report the best (primary) solution consistent with that proposed by Grabow and Mavrikakis37 in terms of the chemistry. Subsequently, we will briefly discuss the nature of other solutions through a statistical analysis. The details of the reaction mechanism and specific DFT values are given in Grabow and Mavrikakis.37 The Supporting Information has the primary solution and details of the other solutions found whose SSR values were within 20% of the former. Primary Solution. Figure 3 shows the parity plot of the primary solution with an SSR value of 0.29 which corresponds to an average relative error of 12% for methanol and water turnover frequency. This compares favorably to the SSR value

XRC, j =

d ln(rMeOH) d ln kj

∀j ∈ J k j ′≠ j , K , j ′∈ J

(30)

where rMeOH is the rate of production of methanol and j′ is the reaction index running over all reactions J but not the specific reaction j for which XRC is sought. Under these conditions, both CO and CO2 contribute to the overall flux for methanol production; however, the CO2 pathway (denoting herein the path CO2* → HCO2* → HCOOH* → CH3O2* → CH2O* → CH3O* → CH3OH* → CH3OH, where * represents a surface-adsorbed state) contributes about 80% of the total methanol production rate; the water gas shift reaction converts 10% of CO to CO2*, while CO2 contributes over 70% of the carbon flux in methanol formation. The remainder of the carbon flux follows CO* → HCO* → CH2O* at which point this merges with the first pathway. Further, the hydrogenation steps on this pathway have the highest degree of

Figure 3. Model versus experiment turnover frequency (TOF) parity plot of the primary solution for the parameter estimation problem for methanol synthesis. Purple squares (■) indicate methanol TOF, red circles (●) indicate water TOF, and the dashed line is the parity line. F

DOI: 10.1021/acs.jpcc.7b08089 J. Phys. Chem. C XXXX, XXX, XXX−XXX

Article

The Journal of Physical Chemistry C

Figure 4. Reaction network for CO and CO2 hydrogenation under representative industrial conditions (P = 75 atm, T = 528 K, mole ratio of CO:CO2:H2 = 0.1:0.1:0.8 with total carbon inlet flow rate of 1 mol C/mol sites/s) based on forward simulation with the primary solution’s parameters. Species are colored on the basis of their coverage values (0−1), the width of the arrows is a qualitative indicator of the magnitude of the reaction flux; gas phase species are suffixed by “g”, while others are surface intermediates. Gas phase reactants (products) are in orange (maroon). Side products are prefixed by a “negative” sign.

consistent with that of Grabow and Mavrikakis.37 Studt et al.81 and van Rensburg et al.82 both show that the methanol synthesis rate for CO/CO2 mixed feeds is sensitive to the formate coverage which they predict to be very high (>0.8 ML) based purely on kinetics and thermochemistry values calculated using DFT on Cu(211) and CuZn(211). Studt et al.81 further propose that a high formate coverage reduces the CO hydrogenation rate and increases the contribution of CO2 to methanol synthesis. More recently, Kunkes et al.89 showed through H/D exchange studies that an inverse kinetic effect is observed for methanol that is not observed for CO formation (through reverse water gas shift reaction) on Cu/ZnO catalyst, indicating that methanol synthesis with purely CO2 feed does not share a common intermediate with the water gas shift (WGS) reactions. This is consistent with our mechanism in Figure 4 because the water gas shift reaction relates CO* and CO2* through the COOH* intermediate while the CO2-to-methanol pathway in itself does not involve this intermediate. Postprocessing Analysis of Multiple Solutions. The multistart approach (with 500 starts) furnished several solutions, each of which is a potential local minimum; however, due to numerical tolerances used and the highly nonlinear topography of the objective surface, several solutions may be almost identical in terms of the underlying reaction mechanism while having slightly different objective values. Figure 5 shows a plot of optimal objective values of the converged solutions and their rank (1 being the best). There are 33 solutions with an objective value of less than 0.35 which is about 20% higher than the objective value of the primary solution (0.29). A large number of solutions (about 400) have an objective around 0.4, about 50 solutions have an even higher objective, and around 20 starts diverged. The profile of the curve in Figure 5 is a clear indicator

Table 1. Degree of Rate Control of Methanol Synthesis and Reaction Flux Values for the Highest Rate-Controling Elementary Steps in the Methanol Synthesis Network under Industrial Conditions (P = 75 atm, T = 528 K, Mole Ratio of CO:CO2:H2 = 0.1:0.1:0.8 with Total Carbon Inlet Flow Rate of 1 mol C/mol sites/s) Using Parameters Derived in the Primary Solution reaction

flux (s−1)

XRC

HCOOH* + H* → CH3O2* CH3O* + H* → CH3OH* + * CO* + H* → HCO* + * H* + CO2* → HCO2* + *

0.038 0.038 0.018 0.038

0.28 0.27 0.19 0.09

Table 2. Surface Coverage Values of the Most Abundant Species in the Methanol Synthesis Network under Industrial Conditions (P = 75 atm, T = 528 K, Mole Ratio of CO:CO2:H2 = 0.1:0.1:0.8 with Total Carbon Inlet Flow Rate of 1 mol C/mol sites/s) Using the Parameters Derived in the Primary Solution species

coverage

CO* HCO2* CH3O* H* OH* * (free sites)

0.15 0.21 0.002 0.25 0.001 0.38

rate control although distributed among several reactions with no single rate controlling step. Only 40% of the surface is free, with formate (HCO2*), hydrogen (H*), and carbon monoxide (CO*) covering the majority of the surface. These results are G

DOI: 10.1021/acs.jpcc.7b08089 J. Phys. Chem. C XXXX, XXX, XXX−XXX

Article

The Journal of Physical Chemistry C

their Euclidean distance (closer to the bottom), the more similar are two solutions and lower in the hierarchy they are located and clustered. For example, solutions “20” and “230” are more similar to each other than they are to “259” (Figure 6). Note that the primary solution discussed earlier is solution “259”. Figure 6 also contains dashed lines corresponding to specific values of the mean difference (in kJ/mol) between two decision variables 2

in any two solutions. This is calculated as εd |π | , where |π| is the total number of parameters. It can be seen that, for mean difference values of of 5, 10, and 15 kJ/mol, there are eight, five, and two clusters, respectively. The existence of several clusters even for a moderate value of mean difference (10 kJ/mol) suggests multiple distinct solutions; that is, it is likely that there are multiple distinct reaction mechanisms that account for the experimental data. Figure 7 shows a heat map of the absolute values of species binding energy deviations (from DFT) of the good solutions. All binding energy deviations in all solutions were negative except for H* whose deviations, however, are always less than 20 kJ/mol; taking the absolute value does not, therefore, eliminate additional information. We can see that, while all solutions require mild or negligible changes to binding energy values of several species such as H*, CO*, H2O*, CO3*, and HCO3*, most solutions require substantial changes (>40 kJ/mol) to the binding energy values of CH2O*, CH3O2*, HCO2*, OH*, etc. Further, some solutions require mild changes to the binding energy of CO2*, while others require substantial changes (>50 kJ/mol). We can, therefore, infer that (a) all solutions require substantial deviations (>50 kJ/mol) of the binding energy of one or more species, indicating that the (111) facet on which all DFT calculations were performed is not the active region of the catalyst, (b) some species have different extents of binding energy deviations in different solutions potentially indicating differences in the underlying reaction mechanism in these solutions, and (c) some binding energy deviations are always small, implying that they are unimportant across the solution space. A similar trend is expected (not shown) for activation barrier (transition state energy) deviations.

Figure 5. Objective values of solutions as a function of the number of solutions that are better in terms of their objective value (mathematically, it is the rank of the solution −1). Shown are solutions with objective ≤1.0. Thirty three solutions (0−32) have an objective value of less than 0.35 and are categorized as “good” solutions.

of the need for a multistart scheme; an arbitrary initial guess for the decision variables would not guarantee the identification of the best (primary) solution. A detailed analysis of each solution is beyond the scope of this work; however, we perform a statistical comparison of these solutions, that we term as “good” (to represent those with objective ≤0.35). It should nevertheless be noted that the framework allows for exploring the solution space comprehensively. Figure 6 shows a dendrogram of an agglomerative hierarchical clustering with complete linkage of the 33 good solutions. This analysis identifies the (dis)similarity between two solutions on the basis of the Euclidean distance (εd) between the solution decision variables π (that is, εd(l , m) = ∥πl − πm∥ = ∑n ∈ N (πl ,n − πm,n)2 where l and m are the lth and mth solutions). The dendrogram is “agglomerative” because it shows a hierarchy of clusters starting with each element in their own cluster (at the bottom) to all elements clustered into one (at the apex); it is “complete linkage” because the Euclidean distance between two clusters is taken to be the maximum of pairwise Euclidean distances between components of one cluster and the other. The smaller



RESULTS: CATALYST DESIGN The primary solution from the parameter estimation problem was used as the reference for identifying catalyst parametersdeviations

Figure 6. Dendrogram of hierarchical agglomerative clustering with complete linkage and Euclidean distance metric for the set of good solutions (objective