Multiscale Modeling Combined with Active Learning for

Dec 10, 2018 - learning can effectively develop surrogate models for system tasks in multiscale modeling. □ INTRODUCTION .... evaluated with KMC and...
0 downloads 0 Views 3MB Size
Article pubs.acs.org/IECR

Cite This: Ind. Eng. Chem. Res. XXXX, XXX, XXX−XXX

Multiscale Modeling Combined with Active Learning for Microstructure Optimization of Bifunctional Catalysts Marcel Núñez and Dionisios G. Vlachos* Department of Chemical and Biomolecular Engineering, University of Delaware, Newark, Delaware 19716, United States

Ind. Eng. Chem. Res. Downloaded from pubs.acs.org by UNIV OF WINNIPEG on 12/23/18. For personal use only.

S Supporting Information *

ABSTRACT: A number of studies have recently demonstrated that catalyst microstructure and defect engineering are important to enhance reaction rates, but rigorous microstructure optimization studies are lacking. Kinetic rate prediction requires models that resolve catalyst sites and their coupling arising from surface diffusion, spatial arrangement of multifunctional sites, and lateral interactions. Modeling these effects requires kinetic Monte Carlo (KMC) simulation. The computational demand of KMC simulation makes direct microstructure optimization infeasible. To overcome this challenge, we parametrize the KMC data (reaction rate) using an active learning approach to capture complex structural dependencies among sites at negligible computational cost. We apply our method to a prototype chemistry over bifunctional materials, a case study reminiscent of the ammonia decomposition reaction on defected NiPt (core/shell) structures. We demonstrate that machine learning can effectively develop surrogate models for system tasks in multiscale modeling.



INTRODUCTION One of the objectives of process intensification is enhancement of transport and/or reaction rates. Increasing catalytic rates, via materials optimization, is therefore a key approach to achieving process intensification. This article focuses on this topic and specifically on microstructure optimization of multifunctional catalysts. We make the case that this hard problem can be tackled by combining multiscale modeling with machine learning methods. Highly dispersed metal nanoparticle catalysts are inherently heterogeneous, consisting of different sites where adsorbates can bind and react. The various sites often exhibit dramatically different activities and their relative abundance changes with catalyst size and shape, resulting in structure sensitivity. This principle has been exploited to engineer catalyst performance, such as activity and selectivity. Techniques for synthesis of shape selected nanoparticles have been developed1,2 and applied to various chemistries including formic acid oxidation,3 acetylene hydrogenation,4 NO reduction,5 benzene hydrogenation,6 trans-to-cis isomerization,7 and 2-propenol oxidation8 to mention a few. More recently, defects have been purposefully engineered on a single crystal for enhancing the rate of the oxygen reduction reaction.9 Computational work has been undertaken to understand structure sensitivity10−12 with the eventual goal of optimizing catalyst microstructure and providing synthesis targets for experimental studies. In optimizing catalyst activity, one seeks to find the catalyst microstructure that maximizes the reaction rate. Computation of the reaction rate on a site requires solving a microkinetic model on that site. When the governing equations and model parameters include spatial correlations or interactions with the © XXXX American Chemical Society

environment of that site, such as lateral interactions (known also as coverage effects), the thermochemistry and kinetic parameters on that site depend on the coverages of species on nearby sites and on the surface diffusion from and to that site,13 i.e., the governing equations must be solved simultaneously on an ensemble of sites. Modeling of multiple sites of varying activity, such as on a nanoparticle, requires again spatially resolved models. Similarly, bifunctional catalysis14−18 involves elementary steps invoking two adjacent or distant sites and possible diffusion of species between sites. As shown in the work of Hanselman et al.,19 the stability and geometric requirements of an active site affect the properties of nearby sites, excluding the possibility of every site having optimal properties. In all aforementioned cases, simultaneous solution of multiple microkinetic models, one on each site, is needed to properly account not only for the density but also the spatial arrangement20 of and coupling among active sites. A first-principles (electronic structure)-based optimization then requires generating catalyst microstructures, evaluating the thermochemistry and kinetic parameters on all sites of each microstructure for given coverages, via density-functional theory (DFT), carrying out microkinetic modeling across the unit cell of the material with parameters evaluated via DFT to Special Issue: Frameworks for Process Intensification and Modularization Received: Revised: Accepted: Published: A

October 1, 2018 December 3, 2018 December 10, 2018 December 10, 2018 DOI: 10.1021/acs.iecr.8b04801 Ind. Eng. Chem. Res. XXXX, XXX, XXX−XXX

Article

Industrial & Engineering Chemistry Research

Figure 1. Schematic of descriptor-based microkinetic modeling and construction of volcano curves. (a) Linear scaling relations (LSRs) estimate the binding energy (BE) of a species, e.g., CH3, vs the BE of a key heteroatom, carbon in this example. Group additivity can be used to extend these simple LSRs to large adsorbates. (b) Brønsted−Evans−Polanyi (BEP) relations relate the activation energy of a reaction Ea to the heat of reaction ΔH. (c) Reaction rate, estimated via microkinetic modeling (MKM), vs the heteroatom BE results in the classic (energy-based) volcano curve and provides an estimate of optimum BE for materials search. (d) Correlation of BE vs the coordination CN of a site. (e) Geometric-based volcano curve vs CN provides insights into the optimum CN of a site for enhancing the rate. This volcano curve holds for a single type of site and leaves unanswered how to pack geometrically multiple active sites given the geometric and stability constraints of a crystal and what are the optimal sites and their geometric arrangement for bifunctional materials.

NiPt is the most active known single crystal. In initial work, the high activity of NiPt was rationalized by well-established volcano plots34 and the N binding energy as a descriptor.35 Molecular dynamics,36 EXAFS,37 and other experimental methods38 have shown that defects are present in the NiPt structure. KMC simulations rationalized the high activity of NiPt based on defects of the core/shell structure based on an interplay between terrace and step sites.39,40 NH3* dissociation on terrace sites of Ni/Pt and N* association on step sites of Ni along and across Pt patches are competing rate-determining steps whose relative importance depends on reaction conditions. Lateral interactions were also shown to play a key role.41−43 Although such KMC studies can predict activity for a given catalyst microstructure, there is currently no way to solve the inverse problem of optimizing activity with respect to catalyst microstructure due to the computational cost of KMC simulation that prohibits its direct use in optimization. Here, we propose for the first time a computational approach that can allow microstructure catalyst optimization, using DFT-based descriptors, while properly accounting for the spatial coupling of multiple sites using the KMC method. Specifically, we develop a surrogate reaction rate model consisting of a decision tree and a neural network regressed to KMC reaction rate data. Initially, a database of randomly generated catalyst structures is built and their activity is evaluated via KMC simulation. The surrogate model is trained on these structures and subsequently employed in simulated annealing optimization to maximize catalyst activity. Due to inadequacies in the training set, an active learning approach is used wherein structures predicted by the optimization are evaluated with KMC and added to the database to refine the surrogate model. In doing so, computational effort is spent populating the database primarily with highly active structures, i.e., using importance sampling.44 KMC simulation, surrogate model training, and catalyst structure optimization are performed in an iterative fashion until convergence is reached. Our approach is demonstrated for a prototype version of the

estimate the spatially average rate and coverages, and evolving the catalyst structure, e.g., using a genetic algorithm, until the maximum rate is achieved. The large size of a given microstructure for extended systems (a large unit cell) and the huge number of surface structures required for optimization render direct use of DFT impractical. Practically, a descriptor-based microkinetic model is used (Figure 1a−c). Herein the activity of a surface site is correlated to a descriptor, such as the binding energy of a key intermediate. Brønsted− Evans−Polanyi (BEP) relationships21,22 connect heats of reactions, and thus the binding energies of species (the descriptor(s)), to activation energies (kinetic parameters) of a reaction network in a quantitative manner. Such an activity− descriptor relation is depicted as a volcano curve, known also as Sabatier’s principle. The volcano curve provides the optimum descriptor (binding energy) that can be used for materials discovery. This approach is now fairly mature. Recent work has shown that it is possible under certain conditions to relate the species binding energies to site attributes, such as geometric descriptor(s). In other words, the energetics of the surface reactions are (often linearly) correlated to geometric (and possibly electronic) descriptors of the sites, i.e., geometry−property relationships are built based on the site’s coordination number,9,23−27 Hamiltonians,28 or machine learning29 using DFT data. Taking geometry−property and activity-binding energy relations together, the microkinetic model on a site can be parametrized as a function of site attributes and volcano curves vs a geometric descriptor can be constructed (Figure 1d, e). Despite coupling among sites being ubiquitous in catalysis, mean field models, which assume spatial homogeneity and ignore spatial correlations, are commonplace. The effect of active-site coupling on catalyst activity can only be accurately evaluated using kinetic Monte Carlo (KMC) simulation, whose application to structure sensitive chemistries is detailed in several review papers.30−32 The NH3 decomposition reaction is a quintessential case.33 For example, it was found that although Ni and Pt are inactive for NH3 decomposition, B

DOI: 10.1021/acs.iecr.8b04801 Ind. Eng. Chem. Res. XXXX, XXX, XXX−XXX

Article

Industrial & Engineering Chemistry Research

Figure 2. (a) Example catalyst structure. The occupancies of Ni (green) atoms on the Pt (gray) substrate are specified by a vector σ. (b) Corresponding KMC lattice with terrace (blue) and edge (red) sites.

Table 1. Elementary Reactions in the A → B Site-Coupling Modela

ammonia decomposition reaction on defected NiPt catalysts consisting of two sites.



index

METHODS Below we describe the modeling of catalyst structures using the KMC method, the construction of the surrogate model from KMC data, the simulated annealing method for catalyst microstructure optimization, and the combination of these techniques into an active learning approach. Kinetic Monte Carlo Simulation of Defected Monolayer Catalysts. A prototype version of the NH3 decomposition model of Guo and Vlachos40 is chosen to demonstrate our methodology. A catalyst structure consists of a defected monolayer of Ni on a Pt substrate. The Atomic Simulation Environment (ASE)45 is used to create a four-layer fcc(111) slab consisting of three bottom layers of Pt and a top layer of Ni, periodic in two dimensions. The cell dimensions chosen for this study are p(d × d) where d = 12. Defects are introduced through vacancies in the Ni layer. The defected structure is represented numerically as a binary occupancy vector σ. Each of the n = d2 = 144 Ni sites, indexed by i ∈ {1, 2, ..., n}, is occupied (σi = 1) if a Ni atom is present and unoccupied (σi = 0) if a vacancy is present instead that exposes a Pt atom underneath. An example structure is shown in Figure 2a. The molecular structure is converted to a KMC lattice by identifying binding sites and their properties. The KMC lattice for the example structure in Figure 2a is shown in Figure 2b. Terrace and edge sites are identified in the Ni adlayer. Terrace sites are Ni top sites with six neighboring Ni atoms, whereas edge sites are Ni top sites with four neighboring Ni atoms and two neighboring vacancies adjacent to each other. For simplicity of the system, all other Ni sites do not participate in the chemistry. Terrace and edge sites are given prespecified energetic and kinetic properties, without the need to recalculate these properties for new structures. For realistic chemistry, this would be the case if sufficient DFT data existed to parametrize all kinetically relevant site types.40 Otherwise, properties could be computed on the fly based on a site’s local environment.9,23,25,26,28,29 The elementary steps and rate constants of the reaction network are shown in Table 1. All rate constants are the same order of magnitude to induce competing rate-determining steps. Adsorption of A occurs on terrace sites whereas desorption of B occurs on edge sites. Surface diffusion couples the two sites. Given a structure σ, we define rKMC(σ) as the steady state rate of production of a gas-phase species P obtained via KMC simulation, normalized by its stoichiometric coefficient in the net gas-phase reaction, as well as the size of the unit cell (n).

1 2 3 4 5

reaction A(g) ↔ At At + *t ↔ *t + At A t + * e ↔ * t + Be Be + *e ↔ *e + Be Be ↔ B(g) + *

kfwd −1

1.0 atm 1.0 s−1 1.0 s−1 1.0 s−1 1.0 s−1

kfwd/krev −1

s

1.0 atm−1 1.0 1.0 1.0 1.0 atm

a

The partial pressures of gas species A(g) and B(g) are PA = 1 atm and PB = 0 atm. The surface species are A, B, and empty sites (*). Subscripts indicate the site type as either terrace (t) or edge (e). Reactions 1−5 are adsorption, diffusion, surface reaction, diffusion, and desorption, respectively. kfwd and krev indicate forward and reverse rate constants, respectively.

The goal of structure optimization is to maximize rKMC(σ) with respect to σ. Stochastic noise in the evaluation of rKMC(σ) from KMC simulation is managed by time averaging over sufficiently large time intervals. The Zacros46,47 graph-theoretical software is used to simulate the reaction network. To ensure steadystate conditions, we use the automated methods developed in previous work.48 Five replicate trajectories are used for each structure to improve sampling. KMC simulation length is extended until the half-length of the 90% confidence interval of the steady-state rate is within 5% of the rate. To extract maximal data from each simulation, rates are computed for each Ni site. Given the steady-state expected propensities of each elementary step, the rate of production of gas species P at site i is n 1 rxns νj ,P rP,KMC ∑ E[ai ,j] i (σ ) = νP j = 1 sj (1) νP is the stoichiometric coefficient of species P in the net gas phase reaction. j indexes the elementary steps, of which there are nrxns. νj,P is the stoichiometric coefficient of gas species P in elementary step j. sj is the number of sites involved in elementary step j, which prevents double counting for reactions involving multiple sites. E[ai,j] denotes the steady state expectation of the propensity of elementary step j at Ni site i, as estimated from statistical sampling. If an Ni site i does not have a corresponding lattice site in the KMC simulation, then rKMC P,i (σ) = 0. The total rate is computed as 1 n r KMC(σ ) = ∑ rP,KMC i (σ ) n i=1 (2) C

DOI: 10.1021/acs.iecr.8b04801 Ind. Eng. Chem. Res. XXXX, XXX, XXX−XXX

Article

Industrial & Engineering Chemistry Research Steady state mass balances render the total rate rKMC(σ) independent of the choice of P. For our prototype reaction network, we use P = B and νB = 1 to track the production of B. We exploit the symmetry of the (111) facet to maximize data usage from a single KMC calculation, i.e. augmenting the data set. For Ni site i, we define the translation operator Ti(σ) as the permutation of σ that translates the catalyst so that Ni site i is mapped to i = 1. For θ ∈ {0, 120, 240}, we define the rotational operator Rθ(σ) as the permutation of σ that rotates the catalyst by θ degrees counterclockwise about Ni site i = 1. For any combination of i, θ, P, and σ, symmetry dictates that KMC rP,1 (R θ(Ti(σ ))) = rP,KMC i (σ )

rectified linear unit (RELU) activation function while the output layer uses the identity function. The site rate data is normalized between 0 and 1 prior to regression. Hyperparameters are chosen by minimizing the prediction error when regressed to the initial KMC database. Twenty hidden nodes were sufficient to achieve a reasonable fit to the training data. In the scikit-learn implementation, a value of α = 0.1 is used as a regularization parameter, which adds the L2-norm of the neural network weights to the cost function to avoid overfitting. A learning rate of 0.0001 and 10 000 maximum iterations are used for the stochastic gradient descent optimization algorithm. The training terminates when the squared loss cost function stops changing by at least 0.00001 for 10 iterations. The site rates and the total rate are computed using the surrogate model as

(3)

The right-hand side of eq 3 is known from simulation. Using KMC data for a single structure and all possible combinations of i and k gives 3n distinct mappings of σ to rKMC P,1 (σ). Surrogate Model. A surrogate model (rsurr(σ)) of low computational cost that possesses the same global maximum as rKMC(σ) is needed for structure optimization. The surrogate model is built using a decision tree and a neural network as implemented in the scikit-learn Python package.49 The sitespecific rates rather than the total rate are used as training data. Most Ni sites have zero rate due to either lack of a corresponding KMC lattice site or lack of nearby complementary sites for bifunctional site coupling. As a result, the data set is unbalanced, and a combination of classification and regression is used. Figure 3 shows a diagram of the surrogate

DT rPsurr (Ti(σ ))*f NN (Ti(σ )) , i (σ ) = f

r surr(σ ) =

1 n surr ∑ r P , i (σ ) n i=1

(4)

(5)

The overall rate (rsurr(σ)), being a spatial average of the site rates (rsurr P,i (σ)), is less noisy than for the individual site rates. However, this effect is counterbalanced by the larger noise in the KMC site rates (rKMC P,i (σ)) used in the training data. Simulated Annealing-Based Optimization. For optimization of rsurr(σ), simulated annealing is employed, as implemented in our own code available in the supplement. Each Metropolis step chooses an element of σ at random and attempts to flip it (1 to 0 or 0 to 1). The objective function is evaluated at each step using the surrogate model (eq 5). The total number of steps (smax) and the cooling schedule are specified. An exponential cooling schedule is used, as it offers a suitable trade-off between ease of implementation and performance,51 giving arbitrarily high probability of achieving the global optimum dependent on the number of steps.52 The temperature T(s) at step s is i sy T (s) = T0expjjj− zzz k τ{

Figure 3. Schematic of the surrogate model.

(6)

The parameter T0 is the initial temperature and is set to the highest site rate observed in the training set, thus scaling the temperature by expected changes in the structure rate. The number of steps is determined based on the number of Ni sites, as n is the maximum number of Metropolis steps necessary to transition between any possible structure and the optimal one. We find that smax = 100n = 14 400 steps are sufficient in all cases. τ is the relaxation time scale chosen to be 1 τ = 5 smax . For computational efficiency, a full list of all translations of the catalyst is maintained and updated to evaluate the summation in eq 5. Active Learning (AL). Initially, 50 structures are randomly generated by choosing a fixed Ni coverage and seeding Ni atoms at random locations, resulting in a variety of quantities and spatial arrangements of terrace and edge sites. It is important that at least some structures in the initial database have active sites, so that the surrogate model learns to identify potential active sites. KMC simulations are performed for the structures to create a KMC database of site and structuredependent rates. Given that machine learning tools are highly interpolative, the surrogate model converges to a false global

DT

model. First, a decision tree (f (σ)) classifies each site as either active or inactive. That is, f DT(σ) = 1 if rsurr P,1 (σ) > 0 and f DT(σ) = 0 if rsurr P,1 (σ) = 0. The input features are the elements of the occupancy vector σ. The maximum depth of the tree is adjusted to maximize predictive accuracy, based on holdout validation with a 75%/25% training set/test set split of the KMC database. A feedforward neural network is used to regress the rates of Ni sites with nonzero rate. Site rates less than 3% of the maximum in the data set are also excluded to prevent regression from predicting unphysical negative values. These sites have negligible impact on the overall structure rate. Neural networks are widely used for spatial pattern recognition,50 as the interconnectedness of perceptrons enables it to capture the geometric relations between inputs, e.g. pixel values in image applications or site occupancies in our application, and model nonlinear responses. The structure of the neural network (f NN(σ)) consists of an input layer with one node for each occupancy in σ, 1 hidden layer, and an output node predicting rsurr P,1 . The hidden layer nodes use a D

DOI: 10.1021/acs.iecr.8b04801 Ind. Eng. Chem. Res. XXXX, XXX, XXX−XXX

Article

Industrial & Engineering Chemistry Research

minimized for the test set when a maximum tree depth of 20 is used.

optimum if the true optimum is dissimilar to the structures in the database.53 To alleviate the issue of false convergence, optimization is performed iteratively using an AL approach. The predicted optimal structures are evaluated with KMC and added to the database, after which the surrogate model is retrained on the augmented database. Incremental learning is not employed. The refined surrogate model is used in subsequent simulated annealing optimization, using the previous optimum as the initial structure. The process is repeated until convergence, wherein the predicted optimal structure is evaluated correctly with the surrogate model and does not change with the addition of new data to the KMC database. Parallelization is exploited to grow the database more rapidly. After evaluating the initial structures and adding them to the database, each of 16 processors perform the AL procedure independently. The processors communicate only by using data from the same database to which they all add. Using an ensemble of 16 structures on separate processors better explores the sampling space of structures for surrogate model regression, making its predictivity more global. The AL process is shown in Figure 4 and summarized in the following pseudocode.

Figure 5. Classification error for the decision tree when trained to the initial database using different maximum depths.

The parity plot for the neural network regression to individual site rates in the final database at Iteration 4 is shown in Figure 6. Variability in the site rates shows that they

Figure 4. Flowchart of the active learning (AL) algorithm.



1. Generate initial training structures. 2. Estimate the rate for the training structures using the KMC method. Record site-dependent rates in a database. 3. Train the surrogate model on the structures in the database. a. Train decision tree to classify each site as having a zero or nonzero rate. b. Train neural network on all sites with nonzero rates. 4. Optimize catalyst structure using simulated annealing and the surrogate model (rsurr(σ)) as the objective function. 5. Evaluate predicted optimal structures with KMC, i.e., compute rKMC(σ). 6. If the optima have been evaluated correctly, i.e. within the noise of the KMC, with the surrogate model and are the best historically, then terminate. Otherwise, add them to the KMC database and return to step 3.

Figure 6. Parity plot of site rates (rA,1(σ)), in units of molecules of B produced per second.

vary depending on the local microstructure. The mean square error of site rate predictions for active sites in optimal structures decreases from 0.05 in iteration 1 to 0.01 in iteration 4 as more highly active sites are added to the database. Although the neural network does not completely resolve the activities, especially for the less active sites which are of minor importance in the optimization, the fit is sufficient for prediction of the total structure rate, as shown below. Improvements to the neural network through feature selection, possibly using clusters,28 is a promising avenue for future research. The performance of the surrogate model in predicting catalyst activity is shown in Figure 7 for four iterations of the AL process, each of which adds 16 structures to the database. In all iterations, simulated annealing optimization using the surrogate model finds the global maximum of rsurr(σ), resulting in the green points (representing the simulated annealing optima) always being higher than the blue points. In the first iteration, inaccuracies in the neural network cause the surrogate model to converge to false optima. The green points appear to the right of the blue points but do not yet reach the global maximum of rKMC(σ). The surrogate model predictions

RESULTS Active Learning. The maximum depth of the decision tree was chosen to be 20, on the basis of minimizing the prediction error when splitting the initial database into a 75% training and 25% test set. Figure 5 shows how the classification error is E

DOI: 10.1021/acs.iecr.8b04801 Ind. Eng. Chem. Res. XXXX, XXX, XXX−XXX

Article

Industrial & Engineering Chemistry Research

Figure 7. Parity plots of total structure rates (r(σ)) at the beginning (top left) and end (bottom right) of the active learning process. Green points are predicted by simulated annealing optimization using a surrogate model regressed to the data in the blue points. The number in the legend indicates the number of structures in the training set.

optimal structure. Eventually, the global optimum of rKMC(σ) is achieved. Relative to the most active structure in the initial KMC database, the global optimum has an activity higher by a factor of about 5. The AL process successfully discovers new and better structures starting with random structures. Physical Insights. The frequencies of terrace and edge sites in each structure are shown in Figure 9. Points are colored to indicate the activity of the corresponding structure, providing insights into how bifunctionality affects the catalyst structure−activity relationship. Figure 10 shows the optimal structure, which exhibits several desirable properties. Not only does it have an abundance of both terrace and step sites, but it places them in close proximity so that after species A adsorbs onto a terrace site, it rapidly diffuses to an edge site where it desorbs as B. Analysis of KMC simulation of the optimal structure confirms that most of the sites on the surface participate in the reaction network, providing a degree of validation that the global optimum has been achieved. CPU Analysis. Figure 11 decomposes the central processing unit (CPU) requirements of the AL approach as well as the estimated time from the brute force approach in which KMC is used at every step of optimization. The CPU cost of the brute force approach is purely due to the KMC simulation performed at each of the 14 400 steps in the simulated annealing optimization. The AL approach requires 2000 times fewer KMC evaluations but introduces the additional costs of regressing and evaluating the neural network. These extra tasks require far less CPU time than KMC evaluation does. The overall savings of the AL approach relative to brute force is a factor of 1600 for our example. Our CPU evaluations do not account for wall time reduction due to

improve after more data has been added to the database. At iteration 4, in which 98 structures are used to regress the surrogate model, the surrogate model estimates the activity of all optimal structures to within the 5% confidence allowed by the KMC sampling. To better visualize the convergence of the green points in Figure 7, Figure 8 shows the surrogate model predictions and KMC evaluations of the most active structure predicted by simulated annealing versus the size of the KMC database. As the size of the KMC database increases, the surrogate model achieves better agreement with KMC for the activity of the

Figure 8. Surrogate model predictions (rsurr(σ)) and KMC evaluations (rKMC(σ)) of the most active structure given by simulated annealing optimization at each iteration of the active learning process, versus the number of structures in the KMC database. The red star indicates the most active structure in the initial KMC database as a point of comparison. F

DOI: 10.1021/acs.iecr.8b04801 Ind. Eng. Chem. Res. XXXX, XXX, XXX−XXX

Article

Industrial & Engineering Chemistry Research

Figure 11. CPU comparison of the brute force approach (projected cost of using KMC directly in optimization) versus our active learning (AL) approach. Relative to brute force, AL requires significantly fewer KMC simulations but introduces the additional costs of training (Surr. train) and evaluating the neural network (Surr eval.).

Figure 9. Fraction of Ni sites that are edge and terrace sites for all 114 structures encountered during optimization. Site fractions are normalized by the total number of Ni sites (n = 144). Because many Ni sites are not identified as either type of site, the fractions do not sum to 1. Markers are colored on a blue (least active) to red (most active) scale to qualitatively indicate the activity of each structure.

consisting of a decision tree and a neural network, was regressed to data from KMC simulation of various (training) microstructures. The surrogate model identified active sites and structures and predicted activity with sufficient accuracy for simulated annealing optimization. An active learning method was exploited whereby the predicted optimal structures were evaluated with the KMC method to assess the accuracy of the surrogate model, and these microstructures were added to the database, as long as the surrogate model was not sufficiently accurate. We applied our technique to a prototype chemistry that mimics the ammonia chemistry exhibiting bifunctional behavior between terrace sites and step sites of defected Ni adlayers on a Pt substrate. In the ammonia chemistry, terrace sites perform dehydrogenation of NHx species and step sites facilitate N−N association and N2 desorption. The accuracy of the surrogate model improved as more structures were added to the KMC database. Simulated annealing optimization predicted an optimal structure with an abundance of terrace and step sites in close proximity to each other. For example, closely packed, narrow stripes of active sites provide high rates. Subsequent KMC evaluation confirmed the activity of such structures to be 5 times greater than the best among the initially randomly generated training structures. Use of the surrogate model in optimization reduced the number of KMC evaluations by >103. Core−shell catalysts as well as metal nanoparticles on oxide supports are commonly used and expose a variety of active sites. Computational methods for estimating binding properties at different sites are continuously being improved. Using such active site−property relations, our methodology can

parallelization, as several strategies are available to control this. For example, a genetic algorithm would allow for parallel evaluation of the objective function for every individual in a population. Given that typical KMC simulations for real chemistry problems usually require run times on the order of hours to days, the CPU savings of the AL approach would be significant.



CONCLUSIONS The development of descriptor-based microkinetic modeling is propelling the search for optimal catalysts. However, the active site has been up to now preselected and coupling between sites, manifested among others in multifunctional catalysts, has not been addressed. The quest for predictive modeling and active site optimization requires spatial resolution models that account for inhomogeneities arising from multiple sites and species lateral interactions, spatial correlations, coupling among sites, structural defects, etc. Optimization of catalyst performance via microstructure engineering requires the evaluation of reaction rates on complex microstructures, using the kinetic Monte Carlo (KMC) approach. The computational cost of KMC simulations makes the optimization task impractical. Herein multiscale modeling is combined with machine learning for computational optimization of catalyst microstructure for chemistries with coupled active sites. A surrogate model,

Figure 10. (a) Molecular picture of the catalyst structure with an optimal arrangement of a defected Ni adlayer (green) and a Pt substrate (gray). (b) Corresponding KMC lattice with terrace (blue) and edge (red) sites. G

DOI: 10.1021/acs.iecr.8b04801 Ind. Eng. Chem. Res. XXXX, XXX, XXX−XXX

Article

Industrial & Engineering Chemistry Research

(10) Van Santen, R. A. Complementary Structure Sensitive and Insensitive Catalytic Relationships. Acc. Chem. Res. 2009, 42, 57−66. (11) Koper, M. T. M. Structure sensitivity and nanoscale effects in electrocatalysis. Nanoscale 2011, 3, 2054−2073. (12) Carchini, G.; Almora-Barrios, N.; Revilla-López, G.; Bellarosa, L.; García-Muelas, R.; García-Melchor, M.; Pogodin, S.; Błoński, P.; López, N. How Theoretical Simulations Can Address the Structure and Activity of Nanoparticles. Top. Catal. 2013, 56, 1262−1272. (13) Andersen, M.; Plaisance, C. P.; Reuter, K. Assessment of meanfield microkinetic models for CO methanation on stepped metal surfaces using accelerated kinetic Monte Carlo. J. Chem. Phys. 2017, 147, 152705−152718. (14) Germani, G.; Schuurman, Y. Water-gas shift reaction kinetics over μ-structured Pt/CeO2/Al2O3 catalysts. AIChE J. 2006, 52, 1806−1813. (15) Pandelov, S.; Stimming, U. Reactivity of monolayers and nanoislands of palladium on Au(1 1 1) with respect to proton reduction. Electrochim. Acta 2007, 52, 5548−5555. (16) Wang, L.; Stimming, U.; Eikerling, M. Kinetic Model of Hydrogen Evolution at an Array of Au-Supported Catalyst Nanoparticles. Electrocatalysis 2010, 1, 60−71. (17) Merte, L. R.; Peng, G.; Bechstein, R.; Rieboldt, F.; Farberow, C. A.; Grabow, L. C.; Kudernatsch, W.; Wendt, S.; Laegsgaard, E.; Mavrikakis, M.; Besenbacher, F. Water-Mediated Proton Hopping on an Iron Oxide Surface. Science 2012, 336, 889−893. (18) Mironenko, A. V.; Vlachos, D. G. Conjugation-Driven ”reverse Mars-van Krevelen”-Type Radical Mechanism for Low-Temperature C-O Bond Activation. J. Am. Chem. Soc. 2016, 138, 8104−8113. (19) Hanselman, C. L.; Gounaris, C. E. A mathematical optimization framework for the design of nanopatterned surfaces. AIChE J. 2016, 62, 3250−3263. (20) Di Iorio, J. R.; Nimlos, C. T.; Gounder, R. Introducing Catalytic Diversity into Single-Site Chabazite Zeolites of Fixed Composition via Synthetic Control of Active Site Proximity. ACS Catal. 2017, 7, 6663−6674. (21) Honkala, K.; Hellman, A.; Remediakis, I. N.; Logadottir, A.; Carlsson, A.; Dahl, S.; Christensen, C. H.; Nørskov, J. K. Ammonia synthesis from first-principles calculations. Science 2005, 307, 555− 558. (22) Nørskov, J. K.; Bligaard, T.; Hvolbaek, B.; Abild-Pedersen, F.; Chorkendorff, I.; Christensen, C. H. The nature of the active site in heterogeneous metal catalysis. Chem. Soc. Rev. 2008, 37, 2163−2171. (23) Mpourmpakis, G.; Andriotis, A. N.; Vlachos, D. G. Identification of descriptors for the CO interaction with metal nanoparticles. Nano Lett. 2010, 10, 1041−1045. (24) Ouyang, G.; Zhu, K.-J.; Zhang, L.; Cui, P.-F.; Teng, B.-T.; Wen, X.-D. Effects of coordination number of Au catalyst on oxygen species and their catalytic roles. Appl. Surf. Sci. 2016, 387, 875−881. (25) Calle-Vallejo, F.; Martínez, J. I.; García-Lastra, J. M.; Sautet, P.; Loffreda, D. Fast prediction of adsorption properties for platinum nanocatalysts with generalized coordination numbers. Angew. Chem., Int. Ed. 2014, 53, 8316−8319. (26) Calle-Vallejo, F.; Loffreda, D.; Koper, M. T. M.; Sautet, P. Introducing structural sensitivity into adsorption-energy scaling relations by means of coordination numbers. Nat. Chem. 2015, 7, 403−410. (27) Ma, X.; Xin, H. Orbitalwise Coordination Number for Predicting Adsorption Properties of Metal Nanocatalysts. Phys. Rev. Lett. 2017, 118, 036101−036105. (28) Guo, W.; Vlachos, D. G. Effect of local metal microstructure on adsorption on bimetallic surfaces: Atomic nitrogen on NiPt(111). J. Chem. Phys. 2013, 138, 174702−174709. (29) Ulissi, Z. W.; Tang, M. T.; Xiao, J.; Liu, X.; Torelli, D. A.; Karamad, M.; Cummins, K.; Hahn, C.; Lewis, N. S.; Jaramillo, T. F.; Chan, K.; Nørskov, J. K. Machine-Learning Methods Enable Exhaustive Searches for Active Bimetallic Facets and Reveal Active Site Motifs for CO2 Reduction. ACS Catal. 2017, 7, 6600−6608. (30) Salciccioli, M.; Stamatakis, M.; Caratzoulas, S.; Vlachos, D. G. A review of multiscale modeling of metal-catalyzed reactions:

predict spatial arrangement of sites to improve catalyst performance for structure sensitive chemistries. Once optimal structures are predicted, experimental efforts to synthesize them would provide crucial benchmarking. Our optimization approach can be used in tandem with experimental characterization of catalyst structure using microscopy and spectroscopic methods, e.g., X-ray absorption spectroscopy (XAS).



ASSOCIATED CONTENT

S Supporting Information *

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.iecr.8b04801.



Structure optimization code at the time of publication (ZIP)

AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected]. Phone: 302-831-2830. ORCID

Dionisios G. Vlachos: 0000-0002-6795-8403 Notes

The authors declare no competing financial interest. The software implementation of the active learning approach to optimize active site coupling is available at https://github. com/VlachosGroup/Structure-Optimization.



ACKNOWLEDGMENTS Financial support from the Defense Advanced Research Project Agency under grant W911NF-15-2-0122 is gratefully acknowledged.



REFERENCES

(1) Somorjai, G. A.; Park, J. Y. Colloid Science of Metal Nanoparticle Catalysts in 2D and 3D Structures. Challenges of Nucleation, Growth, Composition, Particle Shape, Size Control and Their Influence on Activity and Selectivity. Top. Catal. 2008, 49, 126−135. (2) Roldan Cuenya, B. Metal nanoparticle catalysts beginning to shape-up. Acc. Chem. Res. 2013, 46, 1682−1691. (3) Xia, X.; Choi, S.-I.; Herron, J. A.; Lu, N.; Scaranto, J.; Peng, H.C.; Wang, J.; Mavrikakis, M.; Kim, M. J.; Xia, Y. Facile synthesis of palladium right bipyramids and their use as seeds for overgrowth and as catalysts for formic acid oxidation. J. Am. Chem. Soc. 2013, 135, 15706−15709. (4) Kim, S. K.; Kim, C.; Lee, J. H.; Kim, J.; Lee, H.; Moon, S. H. Performance of shape-controlled Pd nanoparticles in the selective hydrogenation of acetylene. J. Catal. 2013, 306, 146−154. (5) Renzas, J. R.; Zhang, Y.; Huang, W.; Somorjai, G. A. Rhodium nanoparticle shape dependence in the reduction of NO by CO. Catal. Lett. 2009, 132, 317−322. (6) Bratlie, K. M.; Lee, H.; Komvopoulos, K.; Yang, P.; Somorjai, G. A. Platinum nanoparticle shape effects on benzene hydrogenation selectivity. Nano Lett. 2007, 7, 3097−3101. (7) Lee, I.; Delbecq, F.; Morales, R.; Albiter, M. a.; Zaera, F. Tuning selectivity in catalysis by controlling particle shape. Nat. Mater. 2009, 8, 132−138. (8) Mostafa, S.; Behafarid, F.; Croy, J. R.; Ono, L. K.; Li, L.; Yang, J. C.; Frenkel, A. I.; Cuenya, B. R. Shape-dependent catalytic properties of Pt nanoparticles. J. Am. Chem. Soc. 2010, 132, 15714−15719. (9) Calle-Vallejo, F.; Tymoczko, J.; Colic, V.; Vu, Q. H.; Pohl, M. D.; Morgenstern, K.; Loffreda, D.; Sautet, P.; Schuhmann, W.; Bandarenka, A. S. Finding optimal surface sites on heterogeneous catalysts by counting nearest neighbors. Science 2015, 350, 185−189. H

DOI: 10.1021/acs.iecr.8b04801 Ind. Eng. Chem. Res. XXXX, XXX, XXX−XXX

Article

Industrial & Engineering Chemistry Research

parallel processing and rate constant rescaling. J. Chem. Phys. 2017, 147, 164103. (49) Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; Vanderplas, J.; Passos, A.; Cournapeau, D.; Brucher, M.; Perrot, M.; Duchesnay, É . Scikit-learn: Machine Learning in Python. J. Mach. Learn. Res. 2011, 12, 2825−2830. (50) Egmont-Petersen, M.; de Ridder, D.; Handels, H. Image processing with neural networks - a review. Pattern Recognition 2002, 35, 2279−2301. (51) Nourani, Y.; Andresen, B. A comparison of simulated annealing cooling strategies. J. Phys. A: Math. Gen. 1998, 31, 8373. (52) Hajek, B. Optimization by simulated annealing: a necessary and sufficient condition for convergence. Lecture Notes-Monograph Series 1986, 8, 417−427. (53) Jin, Y.; Olhofer, M.; Sendhoff, B. A Framework for Evolutionary Optimization With Approximate Fitness Functions. IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION 2002, 6, 481− 494.

Mechanism development for complexity and emergent behavior. Chem. Eng. Sci. 2011, 66, 4319−4355. (31) Stamatakis, M.; Vlachos, D. G. Unraveling the Complexity of Catalytic Reactions via Kinetic Monte Carlo Simulation: Current Status and Frontiers. ACS Catal. 2012, 2, 2648−2663. (32) Stamatakis, M. Kinetic modelling of heterogeneous catalytic systems. J. Phys.: Condens. Matter 2015, 27, 13001−13028. (33) Karim, A. M.; Prasad, V.; Mpourmpakis, G.; Lonergan, W. W.; Frenkel, A. I.; Chen, J. G.; Vlachos, D. G. Correlating Particle Size and Shape of Supported Ru/γ-Al 2 O 3 Catalysts with NH3 Decomposition Activity. J. Am. Chem. Soc. 2009, 131, 12230−12239. (34) Vojvodic, A.; Medford, A. J.; Studt, F.; Abild-Pedersen, F.; Khan, T. S.; Bligaard, T.; Nørskov, J. K. Exploring the limits: A lowpressure, low-temperature HaberBosch process. Chem. Phys. Lett. 2014, 598, 108−112. (35) Hansgen, D. A.; Vlachos, D. G.; Chen, J. G. Using first principles to predict bimetallic catalysts for the ammonia decomposition reaction. Nat. Chem. 2010, 2, 484−489. (36) Wang, H. Y.; Stamatakis, M.; Hansgen, D. A.; Caratzoulas, S.; Vlachos, D. G. Understanding mixing of Ni and Pt in the Ni/Pt(111) bimetallic catalyst via molecular simulation and experiments. J. Chem. Phys. 2010, 133, 224503−224513. (37) Tupy, S. A.; Karim, A. M.; Bagia, C.; Deng, W.; Huang, Y.; Vlachos, D. G.; Chen, J. G. Correlating ethylene glycol reforming activity with in situ EXAFS detection of Ni segregation in supported NiPt bimetallic catalysts. ACS Catal. 2012, 2, 2290−2296. (38) Kitchin, J. R.; Khan, N. A.; Barteau, M. A.; Chen, J. G.; Yakshinskiy, B.; Madey, T. E. Elucidation of the active surface and origin of the weak metal-hydrogen bond on Ni/Pt(1 1 1) bimetallic surfaces: a surface science and density functional theory study. Surf. Sci. 2003, 544, 295−308. (39) Guo, W.; Stamatakis, M.; Vlachos, D. G. Design Principles of Heteroepitaxial Bimetallic Catalysts. ACS Catal. 2013, 3, 2248−2255. (40) Guo, W.; Vlachos, D. G. Patched bimetallic surfaces are active catalysts for ammonia decomposition. Nat. Commun. 2015, 6, 8619− 8625. (41) Mhadeshwar, A.; Kitchin, J.; Barteau, M.; Vlachos, D. The Role of Adsorbate-Adsorbate Interactions in the Rate Controlling Step and the Most Abundant Reaction Intermediate of NH3 Decomposition on Ru. Catal. Lett. 2004, 96, 13−22. (42) Ulissi, Z.; Prasad, V.; Vlachos, D. G. Effect of multiscale model uncertainty on identification of optimal catalyst properties. J. Catal. 2011, 281, 339−344. (43) Guo, W.; Vlachos, D. G. On factors controlling activity of submonolayer bimetallic catalysts: Nitrogen desorption. J. Chem. Phys. 2014, 140, 014703−014709. (44) Ludwig, J.; Vlachos, D. G. Ab initio molecular dynamics of hydrogen dissociation on metal surfaces using neural networks and novelty sampling. J. Chem. Phys. 2007, 127, 154716. (45) Hjorth Larsen, A.; Jørgen Mortensen, J.; Blomqvist, J.; Castelli, I. E.; Christensen, R.; Dułak, M.; Friis, J.; Groves, M. N.; Hammer, B.; Hargus, C.; Hermes, E. D.; Jennings, P. C.; Bjerre Jensen, P.; Kermode, J.; Kitchin, J. R.; Leonhard Kolsbjerg, E.; Kubal, J.; Kaasbjerg, K.; Lysgaard, S.; Bergmann Maronsson, J.; Maxson, T.; Olsen, T.; Pastewka, L.; Peterson, A.; Rostgaard, C.; Schiøtz, J.; Schütt, O.; Strange, M.; Thygesen, K. S.; Vegge, T.; Vilhelmsen, L.; Walter, M.; Zeng, Z.; Jacobsen, K. W. The atomic simulation environment - A Python library for working with atoms. J. Phys.: Condens. Matter 2017, 29, 273002−273031. (46) Stamatakis, M.; Vlachos, D. G. A graph-theoretical kinetic Monte Carlo framework for on-lattice chemical kinetics. J. Chem. Phys. 2011, 134, 214115−214127. (47) Nielsen, J.; D’Avezac, M.; Hetherington, J.; Stamatakis, M. Parallel kinetic Monte Carlo simulation framework incorporating accurate models of adsorbate lateral interactions. J. Chem. Phys. 2013, 139, 224706. (48) Núñez, M.; Robie, T.; Vlachos, D. G. Acceleration and sensitivity analysis of lattice kinetic Monte Carlo simulations using I

DOI: 10.1021/acs.iecr.8b04801 Ind. Eng. Chem. Res. XXXX, XXX, XXX−XXX