Article pubs.acs.org/jcim

Cite This: J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Rational Density Functional Selection Using Game Theory

Suzanne McAnanama-Brereton† and Mark P. Waller*,‡

† Theoretische Organische Chemie, Organisch-Chemisches Institut der Universität Münster, Correnstraße, D-48149 Münster, Germany
‡ Department of Physics and International Centre for Quantum and Molecular Structures, Shanghai University, Shanghai 200444, People’s Republic of China

ABSTRACT: Theoretical chemistry has a paradox of choice due to the availability of a myriad of density functionals and basis sets. Traditionally, a particular density functional is chosen on the basis of the level of user expertise (i.e., subjective experiences). Herein we circumvent the user-centric selection procedure by describing a novel approach for objectively selecting a particular functional for a given application. We achieve this by employing game theory to identify optimal functional/basis set combinations. A three-player (accuracy, complexity, and similarity) game is devised, through which Nash equilibrium solutions can be obtained. This approach has the advantage that results can be systematically improved by enlarging the underlying knowledge base, and the deterministic selection procedure mathematically justifies the density functional and basis set selections.



INTRODUCTION

The Hohenberg−Kohn formalism1 proved that an exact functional exists in principle, although to date no known prescription to obtain such an invaluable tool has been described. As there is currently no systematic way to improve density functionals, an academic dichotomy of sorts has developed. On one side is an argument for the general applicability of functionals. Researchers in this camp select well-performing functionals that have no known significant deficiencies for a given problem. On the other side is an argument for the specificity of functionals. These researchers prefer to use a customized functional for a given problem type, which has the advantage of increased accuracy at the expense of range of applicability. Many functionals now exist, as evidenced by the need for a density functional repository.2 Unfortunately, in stark contrast to the situation of functional development, there exist far fewer research endeavors that seek to produce and collate large, accurate, and diverse data sets for density functional validation.3,4 In regard to diversity we must acknowledge novel work in this direction, e.g., the “Mindless” benchmark data set.5 The GMTKN24 data set proposed by Grimme and co-workers,6−8 which was recently replaced by the GMTKN55 data set,9 contains a diverse superset of 24 chemically different subsets and provides a good general impression of the relative strengths and weaknesses of particular functionals across a diverse range of chemistry. Specialists deliberating over the relative strengths and weaknesses of a variety of functionals across carefully collated data sets will never erode bias.10−18 Therefore, evidence-based information must be successfully transferred in a timely fashion and not buried away in a large number of overwhelming tables peppered in the literature.

The persistent popularity of certain functionals (e.g., B3LYP) in the scientific literature may be at least partially due to this time delay between density functional developers and density functional users. In order to try to reduce this gap, we turned our attention to finding an evidence-based way of deciding the optimal functional and basis set for a given problem from among the currently available functionals and basis sets. We present the Decider, our program for solving such a challenge using game theory in combination with the aforementioned benchmark data sets.

Game theory is a strong branch of applied mathematics that attempts to mathematically map strategy space and find so-called Nash equilibria for a broad range of games.19 It was developed to objectively study conflict and cooperation between entities. The power of such a method is that one may study fictitious play designed to emulate real processes in order to obtain a range of outcomes a priori, including the best-case scenario (BCS) or even the worst-case scenario (WCS). Concepts from game theory apply whenever the actions of several agents are interdependent. These agents may be individuals, groups, firms, or any combination of these. The concepts of game theory provide a language to formulate, structure, analyze, and understand strategic scenarios. Game theory has been successfully applied in such diverse areas as economics, military strategies, biology,20−23 and even chemistry.24,25 To the best of our knowledge, game theory has no precedent in the area of functional selection to date.

In game theory, there is a set of players, each with a set of strategies S. Associated with each strategy Si is a payoff, i.e., the expected utility of a given strategy. A Nash equilibrium (NE) is an optimal strategy in game theory, given that all players have differing payoffs for different strategies. It is a strategy that, when all players have full knowledge of their own and their opponents’ payoffs for each strategy, every rational player will choose. It is not necessarily the best option for any single player, but an optimized strategy for all players involved.

© XXXX American Chemical Society

Received: September 8, 2017


DOI: 10.1021/acs.jcim.7b00542 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX


Journal of Chemical Information and Modeling



METHODS

We have devised a game of three players to best represent the options posed to researchers choosing a density functional/basis set combination. In our work, “players” are considered to be abstract concepts that are useful for providing the payoffs. We have chosen to represent the various functionals and basis sets as strategies for player 1 and player 2, respectively. The strategies for player 3 are a set of benchmark systems, filtered so as to most closely resemble the system under query. Payoffs for the strategies are determined so as to optimize the efficiency of the game. This means that because neither the basis set nor the functional can be benchmarked separately from the other, a mean absolute percent deviation (MAPD)26 from the reference value is one set of payoffs corresponding to player 2. The payoffs corresponding to player 1 are “complexity scores” calculated for each combination of basis set, functional, and system under query. For player 3, the similarities between the query system and the different benchmark systems are calculated and set as payoffs.

The complexity scores for player 1’s payoffs are calculated as combinations of the “complexities” of the system under test, the basis set, and the functional. Each is an attempt to predict the relative timing of a calculation without actually performing one. The three complexities are described as follows:

• The system complexity is a count of the number of atoms present in the query system. All calculations scale with the number of atoms present in the system, and therefore, the system complexity is an attempt to scale the overall complexity score accordingly.
• The basis set and functional complexities are calculated not by timings but rather by how complex each physical approximation is. For example, a STO-3G basis set has a lower complexity score than a STO-6G or a STO-3G* basis set. Basis set complexities are calculated by counting the number of diffuse basis functions, the number of polarized basis functions, and the number of zeta primitive Gaussian functions and then using a calibrated calculation to come up with a complexity score.
• Functional complexities are calculated on the basis of the rungs of Jacob’s ladder, where LDA < GGA < meta-GGA, etc.

The complexities are further adjusted by the other user-input parameters. The overall complexity (C) is then defined as

$$C_{\mathrm{total}} = \frac{C_{\mathrm{basis}} + C_{\mathrm{functional}}}{C_{\mathrm{system}}}$$

where Csystem is the number of atoms in the query system, Cfunctional is defined in Table 1, and Cbasis is defined below:

$$C_{\mathrm{basis}} = l_{\mathrm{zeta}} + l_{\mathrm{pol}} + l_{\mathrm{diff}}$$

where lzeta, lpol, and ldiff were determined by running a set of optimization calculations on methane, benzene, naphthalene, and anthracene using all combinations of the basis sets and functionals in Table 2. The basis sets were chosen so that they represented a general sampling of available basis sets, including single-ζ up to quadruple-ζ basis sets as well as polarization and diffuse functions. Density functionals were chosen so that there were at least two representatives of each rung of Jacob’s ladder except for LDA. These calculations were timed, and the timings were plotted so that a line of best fit could be obtained for each parameter: timing per zeta function versus number of atoms, timing per polarization function versus number of atoms, and timing per diffuse function versus number of atoms. These best-fit equations were then used to describe lzeta, lpol, and ldiff, respectively. Interestingly, the “Jacob’s ladder” groupings were well-recovered in the process of investigating the statistics for functionals on different rungs, which is reassuring. The differences among functionals on any given rung of Jacob’s ladder are harder to predict.

Table 1. Cfunctional Values^a

rung of Jacob’s ladder    Cfunctional
LDA                       1
GGA                       2
meta-GGA                  3
hybrid GGA                4
double-hybrid GGA         5

^a Hartree−Fock was also assigned a Cfunctional value of 1.

Table 2. Density Functionals and Basis Sets Used in Complexity Calibration

rung of Jacob’s ladder    density functionals^a
LDA                       SVWN (refs 27, 28)
GGA                       BP86 (refs 30−32), BP86-D3, BLYP (refs 30, 34, 35), BLYP-D3, PBE (ref 36), PBE-D3
meta-GGA                  TPSS (ref 37), TPSS-D3, M06L (ref 40), M06L-D3
hybrid GGA                B3P86 (refs 39, 30−32), B3LYP (refs 39, 41), B3LYP-D3, TPSSh (ref 42), TPSSh-D3
double-hybrid GGA         B2PLYP (ref 44), B2PLYP-D3

basis sets: 6-31G (ref 29), 6-311G (ref 29), def2-SVP (ref 33), def2-TZVP (ref 33), def2-QZVP (ref 33), 6-31G* (ref 29), 6-311G* (ref 29), cc-pVDZ (ref 38), cc-pVTZ (ref 38), cc-pVQZ (ref 38), 6-31+G (ref 29), 6-311+G (ref 29), 6-311+G* (ref 29), 6-31+G* (ref 29), aug-cc-pVDZ (ref 43), aug-cc-pVTZ (ref 43), aug-cc-pVQZ (ref 43)

^a All of the D3 calculations were performed using Grimme’s D3 dispersion correction.45,46

Figure 1. Decider user interface (UI). The box on the left is where the user draws molecules they want to evaluate, and the box on the right is where the answers are given in list format with their scores.
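The complexity score described in this section can be illustrated with a minimal Python sketch. It rests on several assumptions that are not stated in the text: the overall score is read as the sum of the basis-set and functional terms divided by the atom count, each of lzeta, lpol, and ldiff is taken to be a function count multiplied by a fitted per-function cost, and the best-fit lines are represented by a hypothetical `LinearFit` class with made-up coefficients. Only the Cfunctional values come from Table 1.

```python
from dataclasses import dataclass

# Cfunctional values from Table 1; Hartree-Fock is also assigned 1.
C_FUNCTIONAL = {
    "LDA": 1,
    "GGA": 2,
    "meta-GGA": 3,
    "hybrid GGA": 4,
    "double-hybrid GGA": 5,
    "HF": 1,
}


@dataclass(frozen=True)
class LinearFit:
    """Hypothetical line of best fit: timing per basis function vs. atom count."""
    slope: float
    intercept: float = 0.0

    def __call__(self, n_atoms: int) -> float:
        return self.slope * n_atoms + self.intercept


def basis_complexity(n_zeta, n_pol, n_diff, n_atoms, fits):
    """C_basis = l_zeta + l_pol + l_diff.

    Assumption: each term is a function count times the fitted cost per function.
    """
    l_zeta = n_zeta * fits["zeta"](n_atoms)
    l_pol = n_pol * fits["pol"](n_atoms)
    l_diff = n_diff * fits["diff"](n_atoms)
    return l_zeta + l_pol + l_diff


def total_complexity(rung, c_basis, n_atoms):
    """C_total = (C_basis + C_functional) / C_system, with C_system = atom count."""
    if n_atoms <= 0:
        raise ValueError("the query system must contain at least one atom")
    return (c_basis + C_FUNCTIONAL[rung]) / n_atoms


# Made-up fit coefficients, for illustration only.
EXAMPLE_FITS = {
    "zeta": LinearFit(slope=0.5),
    "pol": LinearFit(slope=0.25),
    "diff": LinearFit(slope=0.125),
}

if __name__ == "__main__":
    c_basis = basis_complexity(n_zeta=2, n_pol=1, n_diff=0, n_atoms=10, fits=EXAMPLE_FITS)
    print(total_complexity("GGA", c_basis, n_atoms=10))
```

Keeping the rung lookup separate from the fitted terms mirrors the text, where the functional contribution is tabulated and the basis-set contribution is calibrated from timings.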


Table 3. Basis Set and Functional Pair NEs, Ranked by Score

functional                  basis set    score^a
BLYP-D3                     def2-QZVP    1
SVWN                        def2-QZVP    0.988
PBE-D3                      def2-QZVP    0.755
OPBE-D3                     def2-QZVP    0.604
TPSS-D3                     def2-QZVP    0.577
mPWLYP-D3                   def2-QZVP    0.571
B97-D3                      def2-QZVP    0.561
B3LYP-D3                    def2-QZVP    0.556
OLYP-D3                     def2-QZVP    0.546
BP86-D3                     def2-QZVP    0.545
TPSSh-D3                    def2-QZVP    0.543
MPW1B95-D3                  def2-QZVP    0.536
BMK-D3                      def2-QZVP    0.523
PBEsol-D3                   def2-QZVP    0.514
PBE0-D3                     def2-QZVP    0.505
M06L (ref 40)               def2-QZVP    0.445
BPBE-D3                     def2-QZVP    0.395
rPW86PBE-D3                 def2-QZVP    0.387
M06-D3                      def2-QZVP    0.386
oTPSS-D3                    def2-QZVP    0.384
PW6B95 (ref 14)             def2-QZVP    0.384
revPBE-D3                   def2-QZVP    0.379
M052X (ref 66)              def2-QZVP    0.368
B1B95-D3                    def2-QZVP    0.362
BHLYP-D3                    def2-QZVP    0.357
XYG3 (ref 69)               def2-QZVP    0.351
B3PW91-D3                   def2-QZVP    0.35
M05-D3                      def2-QZVP    0.349
wB97X-D (ref 71)            def2-QZVP    0.348
TPSS0-D3                    def2-QZVP    0.347
M06L-D3                     def2-QZVP    0.339
CAM-B3LYP-D3                def2-QZVP    0.339
MPWB1K-D3                   def2-QZVP    0.336
PW6B95-D3                   def2-QZVP    0.334
M06HF (ref 73)              def2-QZVP    0.334
PWPB95-D3                   def2-QZVP    0.334
revPBE0-D3                  def2-QZVP    0.311
revPBE38-D3                 def2-QZVP    0.308
PTPSS (ref 7)               def2-QZVP    0.275
M062X (ref 56)              def2-QZVP    0.213
rPW86PBE (refs 36, 72)      def2-QZVP    0.207
B2PLYP-D3                   def2-QZVP    0.189
PBEsol (ref 55)             def2-QZVP    0.16
DSD-BLYP-D3                 def2-QZVP    0.159
B2GPPLYP-D3                 def2-QZVP    0.158
M06 (ref 56)                def2-QZVP    0.156
M062X-D3                    def2-QZVP    0.154
PTPSS-D3                    def2-QZVP    0.153
PWPB95 (ref 7)              def2-QZVP    0.149
DSD-BLYP (ref 57)           def2-QZVP    0.147
LC-wPBE-D3                  def2-QZVP    0.145
M05 (ref 58)                def2-QZVP    0.143
BHLYP (ref 59)              def2-QZVP    0.141
B2GPPLYP (ref 60)           def2-QZVP    0.141
M052X-D3                    def2-QZVP    0.14
MPWB1K (ref 61)             def2-QZVP    0.138
B2PLYP                      def2-QZVP    0.13
MPW1B95 (ref 61)            def2-QZVP    0.124
M06HF-D3                    def2-QZVP    0.115
B1B95 (ref 62)              def2-QZVP    0.109
PBE0 (ref 63)               def2-QZVP    0.107
LC-wPBE (ref 64)            def2-QZVP    0.091
CAM-B3LYP (ref 65)          def2-QZVP    0.09
TPSS0 (ref 67)              def2-QZVP    0.064
BMK (ref 68)                def2-QZVP    0.062
TPSSh                       def2-QZVP    0.053
TPSS                        def2-QZVP    0.045
revPBE38 (ref 70)           def2-QZVP    0.03
B3LYP                       def2-QZVP    0.02
revPBE0 (ref 70)            def2-QZVP    0.015
oTPSS (ref 6)               def2-QZVP    0
B3PW91 (ref 39)             def2-QZVP    0
PBE                         def2-QZVP    −0.04
BPBE (refs 36, 30)          def2-QZVP    −0.06
OPBE (refs 36, 72)          def2-QZVP    −0.16
revPBE (ref 74)             def2-QZVP    −0.17
mPWLYP (refs 34, 35, 75)    def2-QZVP    −0.2
BP86                        def2-QZVP    −0.46
B97 (refs 76, 77)           def2-QZVP    −0.59
BLYP                        def2-QZVP    −0.61
OLYP (refs 34, 35, 78)      def2-QZVP    −1.2

^a A value of 1 represents the recommended pair, and all others are weighted accordingly.

Game theory was chosen because, as illustrated by the Prisoner’s Dilemma game, in which a single player’s payoff could be either very high or very low depending on the choice of another player, NEs are determined by the robustness of each player’s strategy. If a player’s strategy works only in a particular situation, it cannot be an NE. Structures similar to the input system on the basis of the Tanimoto coefficient are found in the database, and the NEs are determined on the basis of the payoffs calculated by the Decider’s payoff calculator. Since the only data being used to calculate the NEs are based on similar structures, it can be assumed that the most robust strategy for all of the structures included in the game would give a reasonably accurate answer. With the Decider, this is an asset because the data being used are necessarily incomplete. Therefore, the three-player game has player 1 representing the complexity score, player 2 the accuracy (MAPD), and player 3 the system similarity for the construction of the payoff matrix.

The reference values for the MAPDs, needed both for player 2’s payoffs and for validation, were obtained from the standard S22 benchmark set of Hobza and co-workers.47 The energies used to calculate the MAPDs for the payoffs were obtained from Grimme’s GMTKN30 benchmark data set.7 It is anticipated that the size and diversity of the data set will increase over time, thus improving the predictive ability of our approach in a systematic manner. Therefore, the MAPDs represent the accuracy of a given functional/basis set combination. Each payoff for the third player is set to be the system similarity, i.e., the Tanimoto score48 between each filtered benchmark result and the given molecular system of interest. The Tanimoto score is also used to initially filter and collect all of the results from the data source that are similar enough to the queried molecular system. The scores were computed using the CDK toolkit (version 1.5.8)49,50 with Lucene-based indexing.
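As an illustration of the two quantities used for the payoffs of players 2 and 3, the following Python sketch computes a mean absolute percent deviation and a Tanimoto score, and then filters benchmark systems by similarity. It does not use the CDK toolkit: fingerprints are represented as plain sets of on-bit indices, and the function names, the similarity threshold of 0.5, and the example data are assumptions made for this example.

```python
def mapd(computed, reference):
    """Mean absolute percent deviation of computed values from reference values."""
    if len(computed) != len(reference) or not computed:
        raise ValueError("need two non-empty sequences of equal length")
    deviations = (abs((c - r) / r) for c, r in zip(computed, reference))
    return 100.0 * sum(deviations) / len(computed)


def tanimoto(bits_a, bits_b):
    """Tanimoto score of two fingerprints given as collections of on-bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    # Convention assumed here: two empty fingerprints count as identical.
    return len(a & b) / union if union else 1.0


def filter_similar(query_bits, benchmark, threshold=0.5):
    """Return (name, score) pairs with score >= threshold, most similar first."""
    scored = [(name, tanimoto(query_bits, bits)) for name, bits in benchmark.items()]
    kept = [pair for pair in scored if pair[1] >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical fingerprints and energies, for illustration only.
    benchmark = {"system_a": {1, 2, 3, 4}, "system_b": {7, 8}, "system_c": {1, 2, 9}}
    print(filter_similar({1, 2, 3}, benchmark))
    print(mapd([-2.2, -4.4], [-2.0, -4.0]))
```

The same score serves two purposes in this sketch, as in the text: it decides which benchmark entries enter the game and it is then reused as the payoff for the third player.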


The payoff matrix then contains strategies (functionals, basis sets, and similarities), which store the properties (percent error, complexity, and Tanimoto score, respectively) as payoffs. Once the payoff matrix has been constructed, the NEs are computed using Gambit,51 and the results are then transformed into a more user-friendly score for inspection. Solving for the NEs determines the maximum payoff for each player with respect to the rest of the choices made. In terms of payoffs in the Decider, this means that the most accurate density functional/basis set combination will not always be the top result, nor will the presumed fastest one. In a given game, the possibility of having more than one NE exists, and the larger the game becomes, the more NEs are possible. What this means in a game is that, given an array of choices, players would be equally or similarly satisfied with more than one outcome, as can be seen in the Battle of the Sexes game. Because our game is fairly large, it contains many NEs. This is an asset to us, as we are then able to rank them in order to provide the optimal functional/basis set combination.
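The idea that a game can hold more than one NE can be made concrete with a short Python sketch. It does not call Gambit; it is a plain brute-force search that covers pure-strategy equilibria only (mixed strategies are not handled), and the function name, the payoff layout, and the Battle of the Sexes payoff values are assumptions chosen for illustration. A strategy profile is reported when no single player can raise their own payoff by changing strategy alone.

```python
from itertools import product


def pure_nash_equilibria(payoffs, n_strategies):
    """Enumerate pure-strategy Nash equilibria of an n-player game.

    payoffs: dict mapping a profile (tuple of strategy indices, one per player)
             to a tuple of payoffs (one per player).
    n_strategies: number of strategies available to each player.
    """
    equilibria = []
    for profile in product(*(range(n) for n in n_strategies)):
        stable = True
        for player, n in enumerate(n_strategies):
            current = payoffs[profile][player]
            for alternative in range(n):
                deviated = profile[:player] + (alternative,) + profile[player + 1:]
                if payoffs[deviated][player] > current:
                    stable = False  # this player gains by deviating alone
                    break
            if not stable:
                break
        if stable:
            equilibria.append(profile)
    return equilibria


# Battle of the Sexes with illustrative payoffs: strategy 0 = opera, 1 = football.
BATTLE_OF_THE_SEXES = {
    (0, 0): (2, 1),
    (0, 1): (0, 0),
    (1, 0): (0, 0),
    (1, 1): (1, 2),
}

if __name__ == "__main__":
    print(pure_nash_equilibria(BATTLE_OF_THE_SEXES, (2, 2)))  # [(0, 0), (1, 1)]
```

Because profiles are tuples of arbitrary length, the same function accepts a three-player payoff table; the cost grows with the product of the strategy counts, which is one reason a dedicated solver is the practical choice for larger games.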



VALIDATION

The S22 benchmark data set was used to validate our approach.11,26 The parallel benzene dimer, our test molecular system, was excluded from our test data set, leaving 21 systems to be used as a reference for our query. The benzene dimer query was submitted to the Decider user interface (see Figure 1) via a MarvinJS Sketch applet.52 The top five, middle five, and bottom five results of our ranked list were then used in an interaction energy calculation to determine timings and accuracy using Gaussian 1653 and Orca.54

The results of our game-theoretical calculation (Table 3) showed that Grimme’s D3 dispersion-corrected functionals45,46 were almost all rated in the top half of the results set. This is to be expected, as they tend to increase the accuracy of a calculation while adding negligible time. The absence of B3LYP-D3 in the top five results is surprising, as B3LYP is commonly used as a generally applicable functional. It is, however, in the top 10. The largest system under study is the adenine−thymine dimer. This means that the large def2-QZVP basis set is still affordable and therefore is always selected because of the high accuracy achieved with this basis set.

The top five functionals in Table 3 are general, accurate, and relatively fast functionals. The inclusion of SVWN is surprising because it is an LDA functional and is therefore considered relatively cheap and inaccurate. However, it performed just as well as the other four in both time taken for the calculation and accuracy, although the accuracy in the benzene dimer case is most likely due to a fortuitous cancellation of errors. The fact that BLYP-D3 is at the top of the list is unsurprising because, along with being a relatively fast functional, it was shown to have the second-lowest MAPD value of all functionals tested on the S22 data set in the GMTKN24 paper.6 The bottom five functionals are general and less accurate but also relatively fast. The disparity in their scores can be explained by the fact that the increased accuracy provided by the dispersion correction at minimal cost weighs heavily in the favor of the dispersion-corrected functionals. In the middle, we should see a trade-off between calculation time and accuracy.

Figure 2 depicts the results of our validation. As can be seen in Figure 2c, our highest-scoring functionals are actually both the fastest and most accurate. The case of M062X is interesting because according to Figure 2b it has a comparable accuracy to the top five, but it is ranked in the middle of the results. Looking at Figure 2a makes this understandable: it takes 4 times as long to perform the same calculation as for the quickest method tested. Figure 2c shows the relative time versus accuracy (MAPD). For the most part, a trade-off between timing and accuracy can be seen for the functionals in the middle of Table 3, as expected. The rPW86PBE functional was the only functional calculated using the Orca package and appears to be an outlier in Figure 2c.

Figure 2. (a) Decider score vs relative timing. (b) Decider score vs MAPD. It should be noted that the lower the MAPD, the closer the value is to the reference value. (c) Relative timing vs MAPD. rPW86PBE is labeled. Green circles, blue squares, and red diamonds are the top five, middle five, and bottom five answers ranked by the Decider, respectively. Times are relative to the fastest calculation performed. In all three graphs, M062X is circled in blue.

The concept of “correctness” for a given strategy is challenging to prove because a true answer is not well-defined. While it may be true that a given functional/basis set combination is an optimal trade-off between timing and accuracy, the intrinsic bias of experts in the field needs to be eventually overcome in order for this to be accepted by the community. We sought alternative ways to confirm the validity of the Nash equilibrium strategies and propose a Feigenbaum test, or subject-matter-expert Turing test.79
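The timing and energy comparison reported for Figure 2 rests on simple arithmetic, sketched here in Python. The sketch assumes the usual supermolecular definition of an interaction energy (dimer energy minus the two monomer energies) and expresses timings relative to the fastest run, as stated in the Figure 2 caption. The function names and all numerical inputs are hypothetical; no corrections beyond those named in the text are applied.

```python
HARTREE_TO_KCAL_PER_MOL = 627.5095  # standard conversion factor


def interaction_energy(e_dimer, e_monomer_a, e_monomer_b):
    """Supermolecular interaction energy, in the same unit as the inputs."""
    return e_dimer - e_monomer_a - e_monomer_b


def relative_timings(timings):
    """Scale wall-clock timings so that the fastest calculation equals 1."""
    fastest = min(timings.values())
    if fastest <= 0:
        raise ValueError("timings must be positive")
    return {method: t / fastest for method, t in timings.items()}


if __name__ == "__main__":
    # Hypothetical energies (hartree) and timings (seconds), for illustration only.
    e_int = interaction_energy(-10.5, -5.0, -5.25)
    print(e_int * HARTREE_TO_KCAL_PER_MOL)
    print(relative_timings({"method_a": 2.0, "method_b": 8.0}))
```

With timings normalized this way, a value of 4 for a method means it took 4 times as long as the quickest method tested, which is how the M062X case in Figure 2a is read.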




The Turing test80 is a cornerstone of the artificial intelligence community. The original test was devised by Alan Turing, who posed the question “Are there imaginable digital computers which would do well in the imitation game?” as a way of testing whether or not a machine can think by allowing the machine to participate in a blind question-and-answer game. If and only if a human judge cannot reliably determine whether the answers are coming from a computer or a human is the machine said to have passed the test. There are now many variants of the Turing test, one of which is the Feigenbaum test. In order to validate our functional selection approach further, we implemented a Feigenbaum test (see Figure 3). The major benefit of this test is that one can easily check (with brute force) the validity of the results obtained. A Web site was created where volunteers can register (http://turing.wallerlab.org). Upon registration (see Figure 4), a random assignment to one of two groups is made: a user becomes either a questioner or an answerer.

Volunteers who are assigned the role of questioner are requested to sketch a dimer of their choosing into a browser-based molecular sketcher. The flexibility of being able to submit any chemical dimer makes sure that the set of test molecules is not tailor-made to bias the results. After the question is submitted, the questioner must wait for a response. The task of the sentient answerer is to provide their opinion on the optimal functional and basis set combination for the interaction energy of the dimer that they receive via an e-mail. The most important metric for the Feigenbaum test is whether the questioners are able to reliably identify answers coming from the Decider. The Feigenbaum test is now open for registration, and participation from computational chemists is encouraged.

Figure 3. Feigenbaum test answer/decision page.

Figure 4. Feigenbaum test registration page.

CONCLUSION

We have developed an objective procedure for selecting density functional and basis set combinations using game theory in combination with existing benchmark data. We anticipate that such a data-driven functional/basis set selection scheme will be of great value to the traditional bench chemist wishing to perform (now routine) density functional theory calculations for a range of studies. We believe in selecting the right functional and basis set combination for the right reason. The Decider is available at http://decider.wallerlab.org.

AUTHOR INFORMATION

Corresponding Author
*E-mail: [email protected].

ORCID
Mark P. Waller: 0000-0003-1650-5161

Notes
The authors declare no competing financial interest.

ACKNOWLEDGMENTS

The Deutsche Forschungsgemeinschaft in the form of SFB858 is gratefully acknowledged.

ABBREVIATIONS

NE, Nash equilibrium; MAPD, mean absolute percent deviation; LDA, local density approximation; GGA, generalized gradient approximation

REFERENCES

(1) Hohenberg, P.; Kohn, W. Inhomogeneous Electron Gas. Phys. Rev. 1964, 136, B864−B871.
(2) CCLRC. http://www.cse.scitech.ac.uk/ccg/dft/ (accessed Sept 5, 2017).
(3) Mardirossian, N.; Head-Gordon, M. Thirty Years of Density Functional Theory in Computational Chemistry: An Overview and Extensive Assessment of 200 Density Functionals. Mol. Phys. 2017, 115, 2315−2372.
(4) Yu, H. S.; He, X.; Li, S. L.; Truhlar, D. G. MN15: A Kohn−Sham Global-Hybrid Exchange−Correlation Density Functional with Broad Accuracy for Multi-Reference and Single-Reference Systems and Noncovalent Interactions. Chem. Sci. 2016, 7, 5032−5051.
(5) Korth, M.; Grimme, S. “Mindless” DFT Benchmarking. J. Chem. Theory Comput. 2009, 5, 993−1003.
(6) Goerigk, L.; Grimme, S. A General Database for Main Group Thermochemistry, Kinetics, and Noncovalent Interactions − Assessment of Common and Reparameterized (meta-)GGA Density Functionals. J. Chem. Theory Comput. 2010, 6, 107−126.
(7) Goerigk, L.; Grimme, S. Efficient and Accurate Double-Hybrid-Meta-GGA Density Functionals: Evaluation with the Extended GMTKN30 Database for General Main Group Thermochemistry, Kinetics, and Noncovalent Interactions. J. Chem. Theory Comput. 2011, 7, 291−309.
(8) https://www.chemie.uni-bonn.de/pctc/mulliken-center/software/GMTKN/gmtkn
(9) Goerigk, L.; Hansen, A.; Bauer, C. A.; Ehrlich, S.; Najibi, A.; Grimme, S. Phys. Chem. Chem. Phys. 2017, DOI: 10.1039/C7CP04913G.
(10) Mardirossian, N.; Head-Gordon, M. Note: The Performance of New Density Functionals for a Recent Blind Test of Non-Covalent Interactions. J. Chem. Phys. 2016, 145, 186101.
(11) Taylor, D. E.; Ángyán, J. T.; Galli, G.; Zhang, C.; Gygi, F.; Hirao, K.; Song, J. W.; Rahul, K.; von Lilienfeld, O. A.; Podeszwa, R.; Bulik, I. W.; Henderson, T. M.; Scuseria, G. E.; Toulouse, J.; Peverati, R.; Truhlar, D. G.; Szalewicz, K. Blind Test of Density-Functional-Based
Methods on Intermolecular Interaction Energies. J. Chem. Phys. 2016, 145, 124105.
(12) Jensen, S. R.; Saha, S.; Flores-Livas, J. A.; Huhn, W.; Blum, V.; Goedecker, S.; Frediani, L. The Elephant in the Room of Density Functional Theory Calculations. J. Phys. Chem. Lett. 2017, 8, 1449−1457.
(13) Karton, A.; Tarnopolsky, A.; Lamere, J.-F.; Schatz, G. C.; Martin, J. M. L. Highly Accurate First-Principles Benchmark Data Sets for the Parametrization and Validation of Density Functional and Other Approximate Methods. Derivation of a Robust, Generally Applicable, Double-Hybrid Functional for Thermochemistry and Thermochemical Kinetics. J. Phys. Chem. A 2008, 112, 12868−12886.
(14) Zhao, Y.; Truhlar, D. G. Design of Density Functionals That Are Broadly Accurate for Thermochemistry, Thermochemical Kinetics, and Nonbonded Interactions. J. Phys. Chem. A 2005, 109, 5656−5667.
(15) Perdew, J. P.; Schmidt, K. Jacob’s Ladder of Density Functional Approximations for the Exchange-Correlation Energy. AIP Conf. Proc. 2000, 577, 1−20.
(16) Jensen, F. How Large is the Elephant in the Density Functional Theory Room? J. Phys. Chem. A 2017, 121, 6104−6107.
(17) Goerigk, L. How Do DFT-DCP, DFT-NL, and DFT-D3 Compare for the Description of London-Dispersion Effects in Conformers and General Thermochemistry? J. Chem. Theory Comput. 2014, 10, 968−980.
(18) Kruse, H.; Goerigk, L.; Grimme, S. Why the Standard B3LYP/6-31G* Model Chemistry Should Not Be Used in DFT Calculations of Molecular Thermochemistry: Understanding and Correcting the Problem. J. Org. Chem. 2012, 77, 10824−10834.
(19) Osborne, M. J.; Rubinstein, A. A Course in Game Theory; The MIT Press: Cambridge, MA, 1994.
(20) Liao, D.; Tlsty, T. D. Evolutionary Game Theory for Physical and Biological Scientists. I. Training and Validating Population Dynamics Equations. Interface Focus 2014, 4, 20140037.
(21) Boudard, M.; Bernauer, J.; Barth, D.; Cohen, J.; Denise, A. GARN: Sampling RNA 3D Structure Space with Game Theory and Knowledge-Based Scoring Strategies. PLoS One 2015, 10 (8), e0136444.
(22) Pfeiffer, T.; Schuster, S. Game-Theoretical Approaches to Studying the Evolution of Biochemical Systems. Trends Biochem. Sci. 2005, 30, 20−25.
(23) Yeates, J. A. M.; Hilbe, C.; Zwick, M.; Nowak, M. A.; Lehman, N. Dynamics of Prebiotic RNA Reproduction Illuminated by Chemical Game Theory. Proc. Natl. Acad. Sci. U. S. A. 2016, 113, 5030−5035.
(24) Stöckelhuber, K. W.; Wießner, S.; Das, A.; Heinrich, G. Filler Flocculation in Polymers − a Simplified Model Derived from Thermodynamics and Game Theory. Soft Matter 2017, 13, 3701−3709.
(25) Veloz, T.; Razeto-Barry, P.; Dittrich, P.; Fajardo, A. Reaction Networks and Evolutionary Game Theory. J. Math. Biol. 2014, 68, 181−206.
(26) Kumbhar, S.; Fischer, F. D.; Waller, M. P. Assessment of Weak Intermolecular Interactions Across QM/MM Noncovalent Boundaries. J. Chem. Inf. Model. 2012, 52, 93−98.
(27) Slater, J. C. A Simplification of the Hartree-Fock Method. Phys. Rev. 1951, 81, 385−390.
(28) Vosko, S. H.; Wilk, L.; Nusair, M. Accurate Spin-Dependent Electron Liquid Correlation Energies for Local Spin Density Calculations: a Critical Analysis. Can. J. Phys. 1980, 58, 1200−1211.
(29) Hehre, W. J.; Ditchfield, R.; Pople, J. A. Self-Consistent Molecular Orbital Methods. XII. Further Extensions of Gaussian-Type Basis Sets for Use in Molecular Orbital Studies of Organic Molecules. J. Chem. Phys. 1972, 56, 2257−2261.
(30) Becke, A. D. Density-Functional Exchange-Energy Approximation with Correct Asymptotic Behavior. Phys. Rev. A: At., Mol., Opt. Phys. 1988, 38, 3098−3100.
(31) Perdew, J. P. Density-Functional Approximation for the Correlation Energy of the Inhomogeneous Electron Gas. Phys. Rev. B: Condens. Matter Mater. Phys. 1986, 33, 8822−8824.
(32) Perdew, J. P. Erratum: Density-Functional Approximation for the Correlation Energy of the Inhomogeneous Electron Gas. Phys. Rev. B: Condens. Matter Mater. Phys. 1986, 34, 7406.
(33) Weigend, F.; Ahlrichs, R. Balanced Basis Sets of Split Valence, Triple Zeta Valence and Quadruple Zeta Valence Quality for H to Rn: Design and Assessment of Accuracy. Phys. Chem. Chem. Phys. 2005, 7, 3297−3305.
(34) Lee, C.; Yang, W.; Parr, R. G. Development of the Colle-Salvetti Correlation-Energy Formula into a Functional of the Electron Density. Phys. Rev. B: Condens. Matter Mater. Phys. 1988, 37, 785−789.
(35) Miehlich, B.; Savin, A.; Stoll, H.; Preuss, H. Results Obtained with the Correlation Energy Density Functionals of Becke and Lee, Yang and Parr. Chem. Phys. Lett. 1989, 157, 200−206.
(36) Perdew, J. P.; Burke, K.; Ernzerhof, M. Generalized Gradient Approximation Made Simple. Phys. Rev. Lett. 1996, 77, 3865−3868.
(37) Tao, J.; Perdew, J. P.; Staroverov, V. N.; Scuseria, G. E. Climbing the Density Functional Ladder: Nonempirical Meta-Generalized Gradient Approximation Designed for Molecules and Solids. Phys. Rev. Lett. 2003, 91, 146401.
(38) Dunning, T. H. Gaussian Basis Sets for Use in Correlated Molecular Calculations. I. The Atoms Boron through Neon and Hydrogen. J. Chem. Phys. 1989, 90, 1007−1023.
(39) Becke, A. D. Density-functional thermochemistry. III. The role of exact exchange. J. Chem. Phys. 1993, 98, 5648−5652.
(40) Zhao, Y.; Truhlar, D. G. A New Local Density Functional for Main-Group Thermochemistry, Transition Metal Bonding, Thermochemical Kinetics, and Noncovalent Interactions. J. Chem. Phys. 2006, 125, 194101.
(41) Stephens, P. J.; Devlin, F. J.; Chabalowski, C. F.; Frisch, M. J. Ab Initio Calculation of Vibrational Absorption and Circular Dichroism Spectra Using Density Functional Force Fields. J. Phys. Chem. 1994, 98, 11623−11627.
(42) Staroverov, V. N.; Scuseria, G. E.; Tao, J.; Perdew, J. P. Comparative Assessment of a New Nonempirical Density Functional: Molecules and Hydrogen-Bonded Complexes. J. Chem. Phys. 2003, 119, 12129−12137.
(43) Kendall, R. A.; Dunning, T. H.; Harrison, R. J. Electron Affinities of the First-Row Atoms Revisited. Systematic Basis Sets and Wave Functions. J. Chem. Phys. 1992, 96, 6796−6806.
(44) Grimme, S. Semiempirical Hybrid Density Functional with Perturbative Second-Order Correlation. J. Chem. Phys. 2006, 124, 034108.
(45) Grimme, S. Accurate Description of van der Waals Complexes by Density Functional Theory including Empirical Corrections. J. Comput. Chem. 2004, 25, 1463−1473.
(46) Grimme, S. Semiempirical Hybrid Density Functional with Perturbative Second-Order Correlation. J. Chem. Phys. 2006, 124, 034108−034116.
(47) Jurecka, P.; Sponer, J.; Cerny, J.; Hobza, P. Benchmark Database of Accurate (MP2 and CCSD(T) Complete Basis Set Limit) Interaction Energies of Small Model Complexes, DNA Base Pairs, and Amino Acid Pairs. Phys. Chem. Chem. Phys. 2006, 8, 1985−1993.
(48) Chemoinformatics: A Textbook; Gasteiger, J., Engel, T., Eds.; Wiley-VCH: Weinheim, Germany, 2003.
(49) Steinbeck, C.; Han, Y.; Kuhn, S.; Horlacher, O.; Luttmann, E.; Willighagen, E. The Chemistry Development Kit (CDK): An Open-Source Java Library for Chemo- and Bioinformatics. J. Chem. Inf. Comput. Sci. 2003, 43, 493−500.
(50) Willighagen, E. L.; Mayfield, J. W.; Alvarsson, J.; Berg, A.; Carlsson, L.; Jeliazkova, N.; Kuhn, S.; Pluskal, T.; Rojas-Chertó, M.; Spjuth, O.; Torrance, G.; Evelo, C. T.; Guha, R.; Steinbeck, C. The Chemistry Development Kit (CDK) v2.0: Atom Typing, Depiction, Molecular Formulas, and Substructure Searching. J. Cheminf. 2017, 9, 33.
(51) McKelvey, R. D.; McLennan, A. M.; Turocy, T. L. Gambit: Software Tools for Game Theory (version 14.1.0, 2014). http://www.gambit-project.org (accessed Sept 5, 2017).
(52) MarvinJS, version 6.2.2; ChemAxon: Budapest, Hungary, 2014; http://www.chemaxon.com/products/marvin/.
(53) Frisch, M. J.; Trucks, G. W.; Schlegel, H. B.; Scuseria, G. E.; Robb, M. A.; Cheeseman, J. R.; Scalmani, G.; Barone, V.; Mennucci, B.; Petersson, G. A.; Nakatsuji, H.; Caricato, M.; Li, X.; Hratchian, H. P.; Izmaylov, A. F.; Bloino, J.; Zheng, G.; Sonnenberg, J. L.; Hada, M.; Ehara, M.; Toyota, K.; Fukuda, R.; Hasegawa, J.; Ishida, M.; Nakajima, T.; Honda, Y.; Kitao, O.; Nakai, H.; Vreven, T.; Montgomery, J. A., Jr.; Peralta, J. E.; Ogliaro, F.; Bearpark, M.; Heyd, J. J.; Brothers, E.; Kudin, K. N.; Staroverov, V. N.; Kobayashi, R.; Normand, J.; Raghavachari, K.; Rendell, A.; Burant, J. C.; Iyengar, S. S.; Tomasi, J.; Cossi, M.; Rega, N.; Millam, J. M.; Klene, M.; Knox, J. E.; Cross, J. B.; Bakken, V.; Adamo, C.; Jaramillo, J.; Gomperts, R.; Stratmann, R. E.; Yazyev, O.; Austin, A. J.; Cammi, R.; Pomelli, C.; Ochterski, J. W.; Martin, R. L.; Morokuma, K.; Zakrzewski, V. G.; Voth, G. A.; Salvador, P.; Dannenberg, J. J.; Dapprich, S.; Daniels, A. D.; Farkas, Ö.; Foresman, J. B.; Ortiz, J. V.; Cioslowski, J.; Fox, D. J. Gaussian 09, Revision E.01, Gaussian, Inc., Wallingford CT, 2009.
(54) Neese, F. Wiley Interdiscip. Rev.: Comput. Mol. Sci. 2012, 2, 73−78.
(55) Perdew, J. P.; Ruzsinszky, A.; Csonka, G. I.; Vydrov, O. A.; Scuseria, G. E.; Constantin, L. A.; Zhou, X.; Burke, K. Restoring the Density-Gradient Expansion for Exchange in Solids and Surfaces. Phys. Rev. Lett. 2008, 100, 136406.
(56) Zhao, Y.; Truhlar, D. G. The M06 Suite of Density Functionals for Main Group Thermochemistry, Thermochemical Kinetics, Noncovalent Interactions, Excited States, and Transition Elements: Two New Functionals and Systematic Testing of Four M06-Class Functionals and 12 Other Functionals. Theor. Chem. Acc. 2008, 120, 215−241.
(57) Kozuch, S.; Gruzman, D.; Martin, J. M. L. DSD-BLYP: A General Purpose Double Hybrid Density Functional Including Spin Component Scaling and Dispersion Correction. J. Phys. Chem. C 2010, 114, 20801−20808.
(58) Zhao, Y.; Schultz, N. E.; Truhlar, D. G. Exchange-Correlation Functional with Broad Accuracy for Metallic and Nonmetallic Compounds, Kinetics, and Noncovalent Interactions. J. Chem. Phys. 2005, 123, 161103.
(59) Becke, A. D. A New Mixing of Hartree-Fock and Local-Density-Functional Theories. J. Chem. Phys. 1993, 98, 1372−1377.
(60) Karton, A.; Tarnopolsky, A.; Lamere, J. F.; Schatz, G. C.; Martin, J. M. L. Highly Accurate First-Principles Benchmark Data Sets for the Parametrization and Validation of Density Functional and Other Approximate Methods. Derivation of a Robust, Generally Applicable, Double-Hybrid Functional for Thermochemistry and Thermochemical Kinetics. J. Phys. Chem. A 2008, 112, 12868−12886.
(61) Zhao, Y.; Truhlar, D. G. Hybrid Meta Density Functional Theory Methods for Thermochemistry, Thermochemical Kinetics, and Noncovalent Interactions: The MPW1B95 and MPWB1K Models and Comparative Assessments for Hydrogen Bonding and van der Waals Interactions. J. Phys. Chem. A 2004, 108, 6908−6918.
(62) Becke, A. D. Density-Functional Thermochemistry. IV. A New Dynamic Correlation Functional and Implications for Exact-Exchange Mixing. J. Chem. Phys. 1996, 104, 1040−1046.
(63) Adamo, C.; Barone, V. Toward Reliable Density Functional Methods without Adjustable Parameters: the PBE0 Model. J. Chem. Phys. 1999, 110, 6158−6170.
(64) Vydrov, O. A.; Scuseria, G. E. Assessment of a Long-Range Corrected Hybrid Functional. J. Chem. Phys. 2006, 125, 234109.
(65) Yanai, T.; Tew, D. P.; Handy, N. C. A New Hybrid Exchange-Correlation Functional using the Coulomb-Attenuating Method (CAM-B3LYP). Chem. Phys. Lett. 2004, 393, 51−57.
(66) Zhao, Y.; Schultz, N. E.; Truhlar, D. G. Design of Density Functionals by Combining the Method of Constraint Satisfaction with Parametrization for Thermochemistry, Thermochemical Kinetics, and Noncovalent Interactions. J. Chem. Theory Comput. 2006, 2, 364−382.
(67) Grimme, S. Accurate Calculation of the Heats of Formation for Large Main Group Compounds with Spin-Component Scaled MP2 Methods. J. Phys. Chem. A 2005, 109, 3067−3077.
(68) Boese, A. D.; Martin, J. M. L. Development of Density Functionals for Thermochemical Kinetics. J. Chem. Phys. 2004, 121, 3405−3416.
(69) Zhang, Y.; Xu, X.; Goddard, W. A. Doubly Hybrid Density Functional for Accurate Descriptions of Nonbond Interactions, Thermochemistry, and Thermochemical Kinetics. Proc. Natl. Acad. Sci. U. S. A. 2009, 106, 4963−4968.
(70) Goerigk, L.; Grimme, S. A Thorough Benchmark of Density Functional Methods for General Main Group Thermochemistry, Kinetics, and Noncovalent Interactions. Phys. Chem. Chem. Phys. 2011, 13, 6670−6688.
(71) Chai, J.-D.; Head-Gordon, M. Long-Range Corrected Hybrid Density Functionals with Damped Atom-Atom Dispersion Corrections. Phys. Chem. Chem. Phys. 2008, 10, 6615−6620.
(72) Murray, E. D.; Lee, K.; Langreth, D. C. Investigation of Exchange Energy Density Functional Accuracy for Interacting Molecules. J. Chem. Theory Comput. 2009, 5, 2754−2762.
(73) Zhao, Y.; Truhlar, D. G. Density Functional for Spectroscopy: No Long-Range Self-Interaction Error, Good Performance for Rydberg and Charge-Transfer States, and Better Performance on Average than B3LYP for Ground States. J. Phys. Chem. A 2006, 110, 13126−13130.
(74) Zhang, Y.; Yang, W. Comment on “Generalized Gradient Approximation Made Simple”. Phys. Rev. Lett. 1998, 80, 890−890.
(75) Adamo, C.; Barone, V. Exchange Functionals with Improved Long-Range Behavior and Adiabatic Connection Methods without Adjustable Parameters: the mPW and mPW1PW Models. J. Chem. Phys. 1998, 108, 664−675.
(76) Hamprecht, F. A.; Cohen, A. J.; Tozer, D. J.; Handy, N. C. Development and Assessment of New Exchange-Correlation Functionals. J. Chem. Phys. 1998, 109, 6264−6271.
(77) Wilson, P. J.; Bradley, T. J.; Tozer, D. J. Hybrid Exchange-Correlation Functional Determined from Thermochemical Data and ab initio Potentials. J. Chem. Phys. 2001, 115, 9233−9242.
(78) Handy, N. C.; Cohen, A. J. Left-Right Correlation Energy. Mol. Phys. 2001, 99, 403−412.
(79) Feigenbaum, E. A. Some Challenges and Grand Challenges for Computational Intelligence. J. Assoc. Comput. Mach. 2003, 50, 32−40.
(80) Turing, A. M. I.-Computing Machinery and Intelligence. Mind 1950, LIX, 433−460.

DOI: 10.1021/acs.jcim.7b00542 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX