Application of an Artificial Neural Network to the Prediction of

Dec 18, 2015 - Application of an Artificial Neural Network to the Prediction of OH ... rate constants taken from the NIST Chemical Kinetics Database. ...
Article pubs.acs.org/JPCB

Application of an Artificial Neural Network to the Prediction of OH Radical Reaction Rate Constants for Evaluating Global Warming Potential
Thomas C. Allison*
Chemical Informatics Research Group, Material Measurement Laboratory, National Institute of Standards and Technology, 100 Bureau Drive, Stop 8320, Gaithersburg, Maryland 20899-8320, United States
Supporting Information

ABSTRACT: Rate constants for reactions of chemical compounds with hydroxyl radical are a key quantity used in evaluating the global warming potential of a substance. Experimental determination of these rate constants is essential, but it can also be difficult and time-consuming to produce. High-level quantum chemistry predictions of the rate constant can suffer from the same issues. Therefore, it is valuable to devise estimation schemes that can give reasonable results on a variety of chemical compounds. In this article, the construction and training of an artificial neural network (ANN) for the prediction of rate constants at 298 K for reactions of hydroxyl radical with a diverse set of molecules is described. Input to the ANN consists of counts of the chemical bonds and bends present in the target molecule. The ANN is trained using 792 •OH reaction rate constants taken from the NIST Chemical Kinetics Database. The mean unsigned percent error (MUPE) for the training set is 12%, and the MUPE of the testing set is 51%. It is shown that the present methodology yields rate constants of reasonable accuracy for a diverse set of inputs. The results are compared to high-quality literature values and to another estimation scheme. This ANN methodology is expected to be of use in a wide range of applications for which •OH reaction rate constants are required. The model uses only information that can be gathered from a 2D representation of the molecule, making the present approach particularly appealing, especially for screening applications.



© XXXX American Chemical Society

INTRODUCTION
The global warming potential (GWP) is one of the primary measures of the impact of the release of a particular substance into the atmosphere. In order to compute the GWP, knowledge of the radiative forcing (RF) or radiative efficiency (RE) and the atmospheric lifetime is required. The GWP index is “based on the time-integrated global mean RF of a pulse emission of 1 kg of some compound (i) relative to that of 1 kg of the reference gas (CO2)”.1 The GWP index is defined as1

$$\mathrm{GWP}_i = \frac{\int_0^{\mathrm{TH}} \mathrm{RF}_i(t)\,dt}{\int_0^{\mathrm{TH}} \mathrm{RF}_r(t)\,dt} = \frac{\int_0^{\mathrm{TH}} a_i[C_i(t)]\,dt}{\int_0^{\mathrm{TH}} a_r[C_r(t)]\,dt} \tag{1}$$

where the subscript i refers to the substance whose GWP is being calculated, the subscript r refers to a reference gas (CO2), RFi(t) is the global mean radiative forcing of substance i, t is the time (typically measured in years), and TH is the time horizon over which the GWP is evaluated. In the numerator on the right-hand side of the preceding equation, ai is the RF per unit mass increase in atmospheric abundance of component i (this is the RE), and Ci(t) is the time-dependent abundance of i. The atmospheric lifetime τ is used to determine the function describing the time-dependent abundance of substance i

$$C_i(t) = C_i(t = 0)\,e^{-t/\tau_i} \tag{2}$$

where Ci(t = 0) is the initial concentration of substance i, and the decrease in the concentration of substance i is exponential. The value of the atmospheric lifetime, τ, depends on the rate constant for the reaction of the substance i with •OH and is determined relative to the rate constant and atmospheric lifetime for methyl chloroform, CH3CCl3

$$\tau_i = \tau_{\mathrm{CH_3CCl_3}}\,\frac{k_{\mathrm{CH_3CCl_3}}(277\ \mathrm{K})}{k_i(277\ \mathrm{K})} \tag{3}$$

where kCH3CCl3(277 K) and ki(277 K) represent the rate constants at 277 K for the reaction of •OH with CH3CCl3 and substance i, respectively.
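The lifetime scaling and exponential decay described above can be illustrated with a minimal Python sketch. The function names and all numerical inputs are hypothetical placeholders chosen for illustration, and the closed-form integral assumes a constant RE and the single-exponential decay of eq 2.

```python
import math

def atmospheric_lifetime(tau_ref, k_ref_277, k_i_277):
    """Eq 3: lifetime of substance i relative to the CH3CCl3 reference."""
    return tau_ref * k_ref_277 / k_i_277

def abundance(c0, t, tau):
    """Eq 2: exponential decay of the abundance C_i(t)."""
    return c0 * math.exp(-t / tau)

def integrated_forcing(a, c0, tau, time_horizon):
    """Integral of a * C(t) from 0 to TH (numerator of eq 1),
    assuming constant a and the decay law of eq 2."""
    return a * c0 * tau * (1.0 - math.exp(-time_horizon / tau))

# Placeholder inputs for illustration only; not recommended values.
tau_ref = 5.0          # years, hypothetical reference lifetime
k_ref_277 = 1.0e-14    # cm3 molecule-1 s-1, hypothetical
k_i_277 = 2.0e-14      # cm3 molecule-1 s-1, hypothetical
tau_i = atmospheric_lifetime(tau_ref, k_ref_277, k_i_277)
remaining = abundance(1.0, tau_i, tau_i)  # fraction left after one lifetime
```

With these placeholder inputs, a substance that reacts twice as fast as the reference has half its lifetime, which is the only point the example is meant to make.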

Special Issue: Bruce C. Garrett Festschrift Received: September 30, 2015 Revised: December 10, 2015


DOI: 10.1021/acs.jpcb.5b09558 J. Phys. Chem. B XXXX, XXX, XXX−XXX

Article

The Journal of Physical Chemistry B

estimating rate constants for hydrogen abstraction reactions with •OH for hydrofluorocarbons and hydrofluoroethers was also introduced by Chandra et al.18 Urata et al.19,20 used an ANN approach to estimate the bond dissociation enthalpy for H atom abstraction by •OH and then applied the methodology of Heicklen14 to estimate rate constants. This is one of a few examples of the use of an ANN to predict •OH reaction rate constants. A correlation between •OH reaction rate constants and vertical ionization energies was found by Güsten,21 who used it to study tropospheric degradation of organic compounds. In a significant review article, Atkinson22 surveyed a number of correlation schemes for estimation of rate constants. Cohen and Benson23 reported an empirical correlation leading to a “universal” expression for the rate constants for reactions of •OH with haloalkanes. Their expression used only the reactant molecule mass and the number of extractable hydrogens and was able to predict rate constants within a factor of 3 of their experimental values. Bartolotti and Edney24 used Hartree−Fock calculations of highest occupied molecular orbital (HOMO) energies and investigated the correlation with the logarithm of the •OH rate constant of hydrofluorocarbons and hydrofluoroethers. The correlation of the ionization potential of the reactant with the activation energy, the pre-exponential A factor, and the logarithm of the room temperature rate constant for •OH reactions was studied by Percival et al.25 Finally, DeMore26 showed that the Arrhenius parameters (A factor and activation energy) for reaction with •OH could be predicted from the rate constant at 298 K for compounds with a single type of C−H bond. Group Additivity. Group additivity models are based on the pioneering work of Benson,27 who developed the approach for the purpose of estimating thermochemical properties.
This model is based on a group additivity approach due to DeMore28 to estimate the logarithm of the rate constant per H atom. A group additivity based scheme for estimating rate constants for the abstraction of H atoms from alkanes by H and CH3 was reported by Sumathi et al.,29 who used a variety of ab initio calculations to parametrize their model. In a recent article, Kazakov et al.30 have discussed a number of considerations in estimating GWPs. To provide an estimate of the atmospheric lifetime, the authors used the method due to Atkinson22,31,32 with modifications to correct defects in results for fluorinated and chlorinated ethylenes. (These authors also considered a method due to Klamt33,34 and extended by Böhnhardt35,36 and found that the results were similar to those of the Atkinson model.) Structure−Activity Relationships. Structure−activity relationship (SAR) models (also described as quantitative structure−activity relationships, QSAR, quantitative structure−property relationships, QSPR, and structure−reactivity relationships, SRR) have been widely applied to predict a number of chemical phenomena, including reaction rate constants. Atkinson31 used a SAR to estimate •OH reaction rate constants with organic compounds, especially S- and N-containing molecules, over a temperature range of 250−1000 K. Another SAR was introduced by Atkinson37 for estimating rate constants of •OH with organic compounds. Wallington et al.38 studied C5−C7 aliphatic alcohols and ethers experimentally and compared these values to those from an SRR model. Tosato et al.39 used a QSAR model to predict reaction rate constants of •OH with haloalkanes and extended the model to

There are, of course, a number of sources of RF values, including the Intergovernmental Panel on Climate Change (IPCC) Fourth Assessment Report1 (AR4) and a recent survey paper by Hodnebrog et al.2 Values of the RF are typically derived from measurement of the infrared spectrum of the compound.3 Quantum chemistry calculations have also proven to be valuable in the estimation of RFs.4,5 The purpose of this article is to describe the application of an artificial neural network (ANN) to the prediction of rate constants at T = 298 K for reactions with OH radicals

$$\mathrm{reactant} + {}^{\bullet}\mathrm{OH} \xrightarrow{\;k(T)\;} \mathrm{products} \tag{4}$$

There are a variety of means of determining a rate constant for reaction with •OH. Experimental techniques are generally the most reliable, with direct measurement of the absolute rate constant preferred over measurement of the rate constant relative to another species. As will be discussed below, it is not unusual for authors of experimental work to create empirical models to estimate rate constants for molecules similar to those they have measured. There are also a number of empirical estimation techniques. Quantum chemistry methods may also be used to determine •OH reaction rate constants. Literature Review. A variety of estimation techniques are available. For the purposes of the present discussion, these are divided into several categories: correlations, group additivity, structure−activity relationships, approaches based on transition state theory, and quantum chemistry models. Correlations. Empirical relationships such as the one developed by Evans and Polanyi6 have been used with considerable success. The method assumes that the activation energy follows a linear relationship within a family of molecules and allows estimates of rate constants via the Arrhenius equation. Gaffney and Levine7 used linear free energy relationships (LFER) to develop expressions for the rates of O(3P) and •OH addition and abstraction reactions with organic molecules. Grosjean and Williams8 used LFER to predict rate constants for reactions of unsaturated aliphatics. The Evans−Polanyi relationship and other correlations were used by Khan et al.9 to predict rate constants for an automatic mechanism generation scheme with application to the study of the atmospheric chemistry of volatile organic compounds. Güsten10 used correlations with rates of reaction in solution to predict rate constants for reactions of •OH with organic compounds. 
Similarly, Dilling et al.11 estimated gas phase reaction rates of •OH with organic compounds by considering their relative reaction rates with hydrogen peroxide in 1,1,2-trichlorotrifluoroethane solution. Experimental data on monoolefins was used by Ohta12 to estimate rate constants for reactions of •OH with diolefins in the gas phase. A correlation between the average 13C and 1H nuclear magnetic resonance (NMR) chemical shifts and the rate constants for reactions of organic compounds with •OH was explored by Hodson.13 Correlations of rate constants with bond dissociation enthalpies (BDEs) were used by Heicklen14 to predict rate constants for H atom abstraction by •OH. Similarly, Jolly et al.15 studied the correlation of •OH reaction rate constants with BDEs of three cyclic alkanes. Chandra et al. have studied the correlation between BDEs and activation energies using density functional theory (DFT) calculations on haloalkanes16 and haloethers.17 A method for


Figure 1. Depiction of a feed-forward artificial neural network with input neurons depicted on the left-hand side of the figure, three hidden layers (gray) consisting of five, four, and three neurons, respectively, depicted in the middle of the diagram, and a single output neuron on the right-hand side. Connections between neurons are depicted as arrows.
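As a rough illustration of the topology in Figure 1, the following minimal sketch evaluates a feed-forward network with five inputs, hidden layers of five, four, and three neurons, and a single output. The tanh hidden activation, the linear output, the random weights, and the helper names (`make_layer`, `forward`) are assumptions made for illustration; this is not the trained model described in this article.

```python
import math
import random

def make_layer(n_in, n_out, rng):
    """Hypothetical helper: random weights and zero biases for one dense layer."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def forward(layers, inputs):
    """Propagate inputs layer by layer; tanh on hidden layers, linear output."""
    activations = list(inputs)
    for index, (weights, biases) in enumerate(layers):
        sums = [sum(w * a for w, a in zip(row, activations)) + b
                for row, b in zip(weights, biases)]
        is_output = index == len(layers) - 1
        activations = sums if is_output else [math.tanh(s) for s in sums]
    return activations

rng = random.Random(0)
sizes = [5, 5, 4, 3, 1]  # layer widths as drawn in Figure 1
layers = [make_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]
output = forward(layers, [1.0, 0.0, 2.0, 0.0, 3.0])
```

Because information only flows from one layer to the next, a single left-to-right loop is enough to evaluate the network.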

and Niu48 created QSAR/QSPR models for estimating rate constants for reactions of •Cl, •OH, and NO3 with alkylnaphthalenes. Wang et al.49 produced a validated QSAR model for estimating the rate constants for the reaction of •OH with atmospheric pollutants. Reactions in which a H atom is abstracted from an alkane by •OH were studied by Huang et al.50 using a QSAR approach based on descriptors calculated using DFT. They also used a genetic algorithm to select inputs for a support vector machine algorithm, which was used to predict rate constants with acceptable accuracy. Finally, Poutsma has recently studied structure−reactivity correlations for reactions in which a H atom is abstracted by •OH51 and •Cl.52 Approaches Based on Transition State Theory. Leroy and Sana53 estimated activation barriers for atom transfer reactions using Morse potentials and used the results to compute reaction rate constants using transition state theory (TST) with an Eckart tunneling correction. Cohen and Benson54 used a TST-based extrapolation approach to study reactions of alkanes with O and •OH as an extension of an earlier work in which reactions of alkanes with •OH were considered.55 Cohen also used TST to study whether rate constants for •OH were additive.56 Xing et al.57 compared experimental results for rate constants of •OH with four chloro- or bromoalkanes with the results from a semiempirical bond energy-bond order (BEBO) transition state method. Donahue et al.58 developed a theory of radical molecule reactions and used it to predict barrier heights. Finally, Huynh et al.59 applied reaction class TST to the study of reactions of •OH with alkanes, stating that the methodology permitted efficient evaluation of all reactions of this type. Quantum Chemistry Models.
Approaches based on variational transition state theory (VTST),60 particularly those that incorporate a semiclassical tunneling approximation such as the small curvature tunneling (SCT) approximation,61,62 have been shown to yield good estimates of rate constants. Another interesting application of quantum chemistry to the prediction of •OH reaction rate constants is due to Klamt,33,34 who introduced a method called MOOH based on semi-

estimate the same rate constants for haloalkanes, aliphatic alcohols, ketones, and aldehydes. An SRR was used by Grosjean and Williams8 to estimate rate constants for reaction with •OH for 150 unsaturated aliphatic compounds. Among the various methods used to estimate •OH reaction rate constants, the model due to Atkinson22,31,32 is perhaps the most widely applied. In this model, it is recognized that four unique reaction pathways, abstraction of a H atom from C−H and O−H bonds, addition of •OH to >C=C< and −C≡C−, addition of •OH to aromatic rings, and interaction of •OH with N, S, and P atoms, are the dominant contributors to the reaction rate constant of a molecule reacting with •OH. By parametrizing this model with nearly 500 reaction rate constants, the authors were able to fit ≈90% of the rate constants within a factor of 2 of their experimental values. Importantly, the authors issue a strong caution against using this model outside of the database with which it was fit.32 This model was used by Meylan and Howard40 in the Atmospheric Oxidation Program (AOP) computer software. Güsten41 has given an overview of the application of QSAR methods for predicting rate constants for reactions of atmospheric organic pollutants with •OH and NO3. Gramatica and co-workers42 used a variety of techniques including QSAR/QSPR to predict rate constants for reactions of •OH and NO3 with organic compounds in the troposphere. Neeb43 revisited the earlier SAR of Atkinson, finding that better results could be produced for a number of compounds. Alvarez-Idaboy et al.44 used quantum chemistry calculations and a transition state theory model to identify an SRR for reactions of ketones with •OH. Tropospheric degradation of VOCs was studied by Gramatica et al.,45 who constructed a QSAR model and used it to predict 460 •OH reaction rate constants.
The authors validated their QSAR model using a Kohonen ANN approach to split their data set into training and validation sets, but they did not use an ANN approach to predict rate constants. Öberg46 developed a QSAR for estimation of •OH reaction rate constants and applied it to a large number of compounds. A SAR for estimation of rate constants of •OH with several alkanes and cycloalkanes was created by Wilson et al.47 Long


also be preferable to predict rate constants at more than one temperature, or over a range of temperatures, perhaps via prediction of a suitable set of Arrhenius parameters. This approach adds a great deal of complexity to the problem and is thus beyond the scope of the present article. A significant factor in the development of the present model is computational cost. Although approaches based on quantum chemistry calculations have been shown to produce good results, in the present article a method is sought that requires only information that can be gathered from a 2D representation (i.e., only information on connectivity, not values of bond lengths or bending angles) of the molecule of interest. This approach requires minimal expense to create suitable input for rate constant prediction and is consistent with a great deal of previous work described above in which estimates of rate constants were based exclusively on chemical groups or chemical descriptors. Given that the ANN, once trained, may be evaluated rapidly, it is expected that the model described in this article will be appropriate for screening applications as well as for implementation in stand-alone software and on Web sites.
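To make concrete what "connectivity only" means, the sketch below counts bond types and bend types from a hand-written atom list and bond list. The data layout and the function name are hypothetical, and bond order, ring membership, and aromaticity are deliberately ignored, so this is a simplification rather than the typing scheme used in the article.

```python
from collections import Counter
from itertools import combinations

def count_bonds_and_bends(atoms, bonds):
    """Count bond types (A-B) and bend types (A-B-C) from connectivity alone.

    atoms: list of element symbols; bonds: list of (i, j) index pairs.
    """
    neighbors = {i: [] for i in range(len(atoms))}
    bond_counts = Counter()
    for i, j in bonds:
        neighbors[i].append(j)
        neighbors[j].append(i)
        bond_counts["-".join(sorted((atoms[i], atoms[j])))] += 1

    bend_counts = Counter()
    for center, attached in neighbors.items():
        # Every unordered pair of neighbors around a central atom is one bend.
        for a, c in combinations(attached, 2):
            ends = sorted((atoms[a], atoms[c]))
            bend_counts[f"{ends[0]}-{atoms[center]}-{ends[1]}"] += 1
    return bond_counts, bend_counts

# Methane: one carbon (index 0) bonded to four hydrogens.
atoms = ["C", "H", "H", "H", "H"]
bonds = [(0, 1), (0, 2), (0, 3), (0, 4)]
bond_counts, bend_counts = count_bonds_and_bends(atoms, bonds)
```

For methane this yields four C-H bonds and six H-C-H bends, the same counts quoted for methane in the Methods section; no bond lengths or angles enter the calculation.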

empirical quantum chemistry calculations in which a rate constant per H atom in a molecule was derived. Klamt’s work has been extended by Böhnhardt et al.35,36 with a parametrization over a larger set of compounds. The authors state that this model is superior to the approach of Atkinson,32 though the recent article by Kazakov et al.30 did not find the difference to be dramatic. Artificial Neural Networks. Artificial neural networks (ANNs) have been applied to a wide range of problems. Often-cited examples of applications of ANNs include handwriting recognition, computer vision, and speech recognition. ANNs have a reputation for good performance on nonlinear fitting problems. Though a number of details about ANNs are presented here, the interested reader is directed to any number of excellent textbooks on neural networks such as the book by Haykin63 for more information. An illustration of an ANN is presented in Figure 1. In this figure, five input neurons are depicted as squares on the left-hand side, with one input neuron per input to the model. In the center of the figure, the hidden layer is depicted as three layers of gray squares. The bulk of the work of the ANN is performed in the hidden layer. A single output neuron is depicted on the right-hand side of the figure as an oval corresponding to the output of the ANN model. Connections between neurons are represented as lines with arrows in the figure and proceed in a single direction from the left-hand side to the right-hand side of the figure. The ANN depicted in the figure has the form of an acyclic directed graph and is thus called a feed-forward neural network.63 In order for the ANN to make predictions, it must first be trained. Training is accomplished by evaluating the network for a given set of input and output values and using an algorithm to minimize the mean square error (MSE) between the expected and predicted output values.
If the MSE is used as a cost function with a gradient descent optimizer in training of a multilayer perceptron (a universal function approximator64) network, then the training method is called backpropagation.63 In the backpropagation procedure, the weightings of the interconnections of the neurons are adjusted to minimize the MSE. Training involves a (potentially large) number of cycles (epochs). Specific details of the ANN used in the present work are given below. Several applications19,20,42 of ANNs to the prediction of •OH reaction rate constants have been described above. However, none of these constitutes a prediction of a reaction rate constant using an ANN. Rather, the applications were indirect. The most interesting application of an artificial intelligence (AI) algorithm mentioned above is found in the recent work of Huang et al.,50 who used a support vector machine (SVM) to predict •OH reaction rate constants. The author is not aware of any other instance in which an ANN or any other AI technique has been applied to the prediction of •OH reaction rate constants. Prediction of Rate Constants. In the present article, the application of an ANN to the problem of predicting •OH reaction rate constants is described, with a particular interest in predicting these for molecules of atmospheric interest. Although it would be ideal to predict these rate constants at 277 K, the temperature at which (or near which) the atmospheric lifetime is generally evaluated, there are considerably fewer experimental rate constants available at this temperature. Therefore, the convention of using reaction rate constants at 298 K is followed in the present article. It would



METHODS
Data Set. A set of rate constants for chemical reactions with •OH at (or near, within 5 K of) 298 K was gathered from the NIST Chemical Kinetics Database.65,66 This database contains more than 8000 determinations of •OH reaction rate constants from review, experiment, and theory. Allowing only a single rate constant per unique reaction reduces the size of the data set considerably. In cases where multiple determinations of a rate constant had been made, the average and standard deviation of the logarithm (base 10) of the rate constant at 298 K were computed for the full set of determinations. Rate constants within one standard deviation of the mean were retained in the set, and a consensus rate constant value was computed as the average value of this set. Molecules in the data set were restricted to those containing no atoms other than H, C, N, O, S, F, Cl, and Br. Molecules had to have at least four atoms, singlet spin multiplicity (i.e., closed shell), and at least one H atom and one C atom to be included in the data set. Chemical structures (represented as 2D “stick figures”) were taken from the NIST Chemistry WebBook67 wherever possible, and the remainder were generated by hand or by name-to-structure conversion performed using the ChemDraw software package. [Certain commercial equipment, instruments, or materials are identified in this article in order to specify the experimental procedure adequately. Such identification is not intended to imply recommendation or endorsement by the National Institute of Standards and Technology, nor is it intended to imply that the materials or equipment identified are necessarily the best available for the purpose.] These structures contain information on the atoms present in the molecule and their connectivity. Note that an optimized 3D structure of the molecule is not needed. This data forms the basis of the model. Data Model.
In order to create an ANN, it is necessary to define a suitable representation of the data for use as input neurons. For the present study, an algorithm was developed that identified the set of chemical bonds and bends present in the data set using functionality contained in the OpenBabel chemical informatics package.68,69 Using a 2D representation (MOL file) of the molecule or the Cartesian coordinates, the OpenBabel package is capable of recognizing chemically


bonded atoms and assigning types to atoms (e.g., aromatic C) and bonds (e.g., double bond). A list of bends is assembled from the list of bonded atoms. The full set of bonds and bends was included in an initial input set. Single, double, and triple bonds were recognized and used in the input data set. Similarly, atoms in a ring were distinguished, as were D atoms and aromatic C atoms. The input to the ANN is simply a count of each type of input present in the molecule. For example, in the methane molecule, the input would consist of four C−H bonds and six (unique) H−C−H bends. Note that the values of bond lengths and bond angles are not used, just the number of these that occur in the molecule.

Given the diversity of chemical bonds and bending angles present in the molecules in the data set, a large number of input parameters (164) was generated. Given the small size of the full data set (791), it is desirable to reduce the number of input parameters. Several steps were applied to accomplish this goal. These steps involved reductions in the size of the input parameter set as well as the data set as follows. Input parameters that were not used at least twice or that were used more often in the testing set than in the training set were removed from the input parameter set. Next, molecules that did not have at least four nonzero input parameters (i.e., at least four bonds or bends in the molecule were represented in the input parameter set) were removed from the data set. This procedure was iterated until self-consistent sets of input parameters and molecules were obtained. Typically, this procedure removed a large number of input parameters (≈55) but a small number of molecules (≈2). In this manner, a set of 791 molecules and a set of 109 input parameters were obtained for use in training and testing the ANN.

The ANN model uses a single output neuron (a modified value of the rate constant, as will be seen below). Training the ANN involves a minimization of the MSE between the ANN output value and the training data value. One requirement of the ANN software was that the output neuron values lie in the range [0,1]. This is due to the function used to represent the output neuron. Since the rate constants used for training spanned many orders of magnitude, a logarithmic function was used to map these values to a set of values distributed in the required interval. The performance of the ANN depends strongly on the function used for this mapping. After exploring a number of options via a trial-and-error approach, the following procedure was selected to map the rate constant (k) data to its modified form (k′). First, the mean (μk) and standard deviation (σk) of the base 10 logarithm of the set of rate constants were computed. A z-score was computed as

$$z = \frac{\log_{10} k - \mu_k}{\sigma_k} \tag{5}$$

where μk = −11.51 and σk = 1.40. The minimum (kmin) and maximum (kmax) values of the set {z} were computed and multiplied by 1.05. This was done so that when the values {z} were mapped to the range [0,1] the extreme values would be avoided (this was found to improve the ANN). The extreme values were kmin = −1.99 and kmax = 1.84. The final set of values used to train and test the ANN were computed as

$$k' = \frac{z - k_{\min}}{k_{\max} - k_{\min}} \tag{6}$$

(these will be referred to as “modified rate constants” hereafter). This procedure weights the rate constant data more equally and spreads it more evenly within the required interval. This function was found to produce acceptable results more often than other functions and was thus used.

Artificial Neural Network. The number of input neurons is fixed by the number of chemical descriptors in the input data set (109), and the number of output neurons was one. Most of the work of the ANN is performed in the hidden layer, which contained three layers containing 83, 55, and 28 neurons, respectively. Different combinations of neurons in the hidden layer and different numbers of layers within the hidden layer were tested and found to differ very little from the arrangement used here. The Fast Artificial Neural Network (FANN) software was used to construct, train, and validate the ANN. The technical details of the ANN are now given. Some of these are specific to the FANN package,70 whereas many are commonly used in most or all ANN software packages. The neurons in the hidden layer used the Elliott activation function.71 This function is similar to the sigmoidal function that is commonly used, but it is computationally cheaper to evaluate. An activation steepness (a numerical value that determines the speed of learning) of 0.7 was used for these neurons. The output layer used a linear activation function. The training algorithm made use of a hyperbolic tangent error function and minimized the number of bit failures (predicted values that deviated from the training value more than some tolerance). This mode of training was found to be superior to minimization of the MSE below some tolerance. Initial values for the weights of the interconnections in the ANN were generated using the procedure due to Nguyen and Widrow.72 It was found that trial ANNs that had lower MSEs after initialization tended to produce fits with lower overall MSEs and with fewer bit failures, so a procedure in which the initialization procedure was repeated until the MSE was less than 0.6 was used. Training was accomplished through use of the RPROP algorithm due to Riedmiller and Braun.73

Training. Training was carried out for no more than 6000 epochs to avoid problems with overfitting, a situation in which an ANN can fit the training data very well but performs poorly on data not used in training. It was generally possible to reduce the number of bit failures to a small number, but not to zero. The backpropagation training method uses the MSE. In the present application, this is defined as

$$\mathrm{MSE} = \frac{1}{N}\sum_{i}^{N}\left(k'_{\mathrm{pred}} - k'_{\mathrm{train}}\right)^2 \tag{7}$$

where ktrain is the input value of the rate constant and kpred is the value predicted by the ANN. In order to fit an ANN with the widest range of applicability, the initial training set consisted of all 791 molecules. However, using all of the data makes it difficult to evaluate the reliability of the ANN. Therefore, the data set was divided into training and validation sets, as described in the following section.

Validation. Validation of the performance of the ANN was accomplished by dividing the full data set into a training set and a validation set. The data were randomly assigned to these sets such that approximately 90% of the data was in the training set and the remainder was assigned to the validation set. This type of validation is a type of cross-validation known as repeated random subsampling validation and is a good statistical means for assessing how the ANN model will perform in practice. Individual errors ϵ are computed as


$$\epsilon = \frac{k_{\mathrm{pred}} - k_{\mathrm{train}}}{k_{\mathrm{train}}} \tag{8}$$

and expressed as a percentage. The primary error metric used to evaluate the overall quality of the fit is the mean unsigned percent error (MUPE)

$$\mathrm{MUPE} = \frac{1}{N}\sum_{i=1}^{N}\left|\epsilon_i\right| \tag{9}$$

Note that this is a more stringent measure of the error as it is the MSE that is used to optimize the ANN, not the MUPE. Furthermore, the logarithmic dependence of the mapping function (see eq 5) and the large number of orders of magnitude spanned by the rate constants used in fitting the ANN (from 10⁻¹⁵ to 10⁻¹⁰ cm³ molecule⁻¹ s⁻¹) make the MUPE especially sensitive to small errors in the values of the mapping function. Thus, the MUPE will serve as a very sensitive measure of performance of the ANN. Also note that the ANN is trained using the modified rate constants (k′; see eqs 5 and 6), whereas the MUPE is defined using the values of the rate constants (k) themselves. The values of the rate constants may be recovered from the modified rate constant values using

$$k = 10^{\,\sigma_k\left[k'\left(k_{\max} - k_{\min}\right) + k_{\min}\right] + \mu_k} \tag{10}$$
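The mapping between rate constants and modified rate constants, and the MUPE metric, can be sketched in a few lines of Python. The constants are the values of μk, σk, kmin, and kmax quoted in the text; the function names and the sample rate constants are hypothetical, and the inverse mapping is written by solving eqs 5 and 6 for k.

```python
import math

MU_K = -11.51     # mean of log10(k), value quoted in the text
SIGMA_K = 1.40    # standard deviation of log10(k), value quoted in the text
K_MIN = -1.99     # scaled lower bound of the z-scores, value quoted in the text
K_MAX = 1.84      # scaled upper bound of the z-scores, value quoted in the text

def to_modified(k):
    """Map a rate constant k to its modified form k' (eqs 5 and 6)."""
    z = (math.log10(k) - MU_K) / SIGMA_K
    return (z - K_MIN) / (K_MAX - K_MIN)

def from_modified(k_prime):
    """Invert eqs 5 and 6 to recover k from k'."""
    z = k_prime * (K_MAX - K_MIN) + K_MIN
    return 10.0 ** (SIGMA_K * z + MU_K)

def mupe(k_pred, k_train):
    """Mean unsigned percent error (eqs 8 and 9), in percent."""
    errors = [abs((p - t) / t) for p, t in zip(k_pred, k_train)]
    return 100.0 * sum(errors) / len(errors)

# Illustrative values only (not taken from the data set).
k_train = [1.0e-14, 3.0e-12, 5.0e-11]
k_pred = [1.1e-14, 2.7e-12, 6.0e-11]
round_trip = [from_modified(to_modified(k)) for k in k_train]
error_percent = mupe(k_pred, k_train)
```

Because the network is fit in the logarithmic k′ space while the MUPE is evaluated on k itself, a small deviation in k′ translates into a much larger percent error after the inverse mapping.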

RESULTS AND DISCUSSION
A training set consisting of 699 randomly selected molecules from the full data set was used to train an ANN as described above. The remaining 92 molecules formed the testing set and were used to assess the predictive capability of the ANN. In order to test the robustness of the training procedure, the ANN was fit 100 times with the same training and testing sets. This resulted in a MUPE in the training set of 13% with a standard deviation of 3.2%. The MUPE for the testing set was 59% with a standard deviation of 4.1%. In order to demonstrate the robustness of the model with respect to different data sets, three randomly generated data sets were tested 100 times each as described above. The results for the training and testing sets for all three sets were similar, indicating that the results were not a consequence of a fortunate selection of testing set.

In order to demonstrate the operation of the ANN model presented here, a representative ANN was selected. The full set of input and output values for the training and testing sets are given in the Supporting Information. A subset of the results is presented in this article to demonstrate the performance of the method. The representative ANN was fit in 6000 epochs with six bit failures. The MSE of the fit was 2.9 × 10⁻⁴. The MUPE of the training set was 12%, with extreme values of the percent error of −61 and 175%. When the testing set was evaluated using the trained ANN, the MUPE increased to 51% with extreme values of −88% (about a factor of 8 error versus the training value) and 297% (about a factor of 4 versus the training value).

A histogram plot of the percent error (PE) values for the training set is presented in Figure 2. This plot shows that the set of rate constants in the training set is fit quite well. Examination of the PE values reveals that 70% of the values are fit with a PE of ±10% or less and 97% of the values are fit with a PE of ±67% or less. The largest PE values in the set (−67%, +175%) correspond to an error less than a factor of 3. Thus, it is clear that the ANN model is quite capable of fitting its training set well. The fit produced by Atkinson was reported to fit 90% of the input values within a factor of 2.32 In the present results, 98% of the data were fit within a factor of 2 of the training value, providing another indication of the quality of the fitting procedure.

Figure 2. Histogram plot of the mean unsigned percent error of the results predicted by the ANN for the training set.

The robustness of the method is demonstrated by plotting the PE values for the representative testing set evaluated versus the training set values as shown in Figure 3. There is clearly

Figure 3. Histogram plot of the mean unsigned percent error of the results predicted by the ANN for the testing set.

significantly more error in these results versus the training set. However, 75% of the predictions fell within a PE of ±67%. Thus, rate constants are reasonably well predicted by the ANN method, though there is room for improvement. A representative set of 18 compounds is presented in Table 1. In this table, the training and predicted rate constants are given along with the percent error of the values. The values in this table came from the training set and illustrate the quality of the fitting of the ANN model. It is desirable to test the ANN model more rigorously as well as to compare it to to another estimation scheme. This is done by comparing the rate constants predicted by the ANN model and the rate constants predicted by the AOPWIN software package74 to the values published by the NASA Panel for Data Evaluation.75 The latter values are obtained via extensive expert F


Table 1. Predictions of the Rate Constant at 298 K Made by the ANN Compared to Training Values for 18 Compounds (a)

name               formula         CAS           kpred       ktrain      error
HFC-152a           CH3CHF2         75-37-6       2.52(−12)   2.30(−12)   9%
methyl chloroform  CH3CCl3         71-55-6       9.52(−13)   9.53(−13)   0%
HFC-32             CH2F2           75-10-5       1.88(−11)   1.95(−11)   −3%
HFC-134            CHF2CHF2        359-35-3      4.71(−11)   4.63(−11)   1%
HFC-152            CH2FCH2F        624-72-6      7.99(−12)   7.79(−12)   2%
HFC-161            CH3CH2F         353-36-6      5.73(−12)   5.65(−12)   1%
HFC-236cb          CF3CF2CH2F      677-56-5      6.27(−13)   6.16(−13)   1%
HFC-236ea          CF3CHFCHF2      431-63-0      1.56(−12)   1.61(−12)   −3%
HFC-245ca          CH2FCF2CHF2     679-86-7      1.09(−11)   1.15(−11)   −5%
HFC-245ea          CHF2CHFCHF2     24270-66-4    1.40(−11)   1.21(−11)   15%
HFC-245fa          CHF2CH2CF3      460-73-1      4.15(−12)   4.06(−12)   2%
HFC-263fb          CF3CH2CH3       421-07-8      7.69(−14)   7.56(−14)   1%
HFC-356mcf         CF3CF2CH2CH2F   161791-33-9   3.73(−11)   2.41(−11)   54%
HFC-356mff         CF3CH2CH2CF3    407-59-0      4.83(−11)   5.70(−11)   −15%
HFC-365mfc         CF3CH2CF2CH3    406-58-6      4.93(−11)   4.85(−11)   1%
HFE-143a           CF3OCH3         421-14-7      1.04(−14)   1.27(−14)   −18%
HFE-245fa          CHF2OCH2CF3     1885-48-9     1.88(−11)   2.00(−11)   −5%
HFE-356mff2        CF3CH2OCH2CF3   333-36-8      1.98(−14)   2.52(−14)   −21%

mean unsigned percent error                                              8.7%

(a) Rate constants are in units of cm³ molecule⁻¹ s⁻¹. Values in parentheses represent powers of 10.
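The error column of Table 1 can be checked with a short calculation, and the same arithmetic relates a percent error to the multiplicative factors quoted in the text. This is an illustrative sketch only: the helper name `pe_to_factor` is an assumption, and the list simply transcribes the error column of Table 1.

```python
def pe_to_factor(pe_percent):
    """Convert a signed percent error into a multiplicative factor >= 1."""
    ratio = 1.0 + pe_percent / 100.0  # this is k_pred / k_train
    if ratio <= 0.0:
        raise ValueError("percent error must be greater than -100%")
    # Under-predictions are reported as the reciprocal, so 0.5 becomes 2.
    return ratio if ratio >= 1.0 else 1.0 / ratio


# Error column of Table 1, in percent, in row order.
table1_errors = [9, 0, -3, 1, 2, 1, 1, -3, -5, 15, 2, 1, 54, -15, 1, -18, -5, -21]
table1_mupe = sum(abs(e) for e in table1_errors) / len(table1_errors)

print(f"Table 1 MUPE: {table1_mupe:.1f}%")
print(f"-88% corresponds to a factor of {pe_to_factor(-88):.1f}")
print(f"+297% corresponds to a factor of {pe_to_factor(297):.1f}")
```

The mean of the listed errors reproduces the 8.7% given in the table. A prediction within a factor of 2 of the training value corresponds to a percent error between −50% and +100%.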

Table 2. Comparison of Rate Constants (cm³ molecule⁻¹ s⁻¹) at 298 K Predicted by the Artificial Neural Network Model and by AOPWIN Software (ref 74) that Implements the Atkinson Method (refs 22, 31, 32) versus High-Quality Review Values Taken from the NASA JPL Compilation (ref 75)

name                            CAS number   kJPL        kpred       error (ANN)   kAOP        error (AOPWIN)
ethanol                         115-10-6     3.35(−12)   2.96(−12)   −11%          1.66(−12)   67%
HCFC-132b                       1649-08-7    1.70(−14)   1.62(−14)   −4%           1.10(−14)   42%
HFC-152a                        75-37-6      3.30(−14)   1.59(−14)   −51%          3.48(−14)   −5%
HCFC-225cb                      507-55-1     8.90(−15)   8.83(−15)   0%            1.25(−15)   150%
HFC-245eb                       460-73-1     1.70(−14)   5.82(−15)   −65%          6.80(−15)   85%
HCFC-123                        306-83-2     3.60(−14)   2.90(−14)   −19%          3.00(−14)   18%
HCFC-31                         593-70-4     4.10(−14)   4.38(−14)   7%            3.30(−14)   21%
HFC-365mfc                      406-58-6     6.90(−15)   1.52(−14)   120%          8.70(−15)   −23%
HCFC-225ca                      422-56-0     2.50(−14)   2.66(−14)   6%            5.04(−15)   132%
2-bromo-1,1,1-trifluoroethane   421-06-7     1.60(−14)   1.08(−14)   −32%          1.86(−14)   −15%

mean unsigned percent error                                          26%                       50%

review and generally constitute the best data available for a particular rate constant. The AOPWIN software package implements the method of Atkinson (refs 22, 31, 32). The values used to train the ANN model or to fit the Atkinson method may differ from the NASA Panel values, so this comparison provides a critical evaluation of the two methods. The results are presented in Table 2 for 10 compounds. It is clearly seen that both estimation schemes do reasonably well and that the results of the present ANN model are in better agreement with the NASA Panel values than those estimated by the Atkinson scheme. In fact, the ANN model results have a MUPE of 26%, whereas the results of the Atkinson scheme have a MUPE of 50%. However, the small number of results presented suggests that these results should not be taken as an indication of the true performance of either method.

Given the diversity of chemical bonding in compounds that might reasonably be tested and the rather small data set that can be used for fitting, one can start to see limitations in the present approach. First, the number of •OH reaction rate constants available in the literature for training an ANN model is around 1000, and a significant number of these are of unknown quality. If one relies only on well-established rate constant values, then the number of compounds available drops to perhaps 500. This points out a weakness in the current model. When the diversity of the chemical bonding in the training set is increased, the number of input parameters can grow rapidly to the point where the number of input parameters is a significant fraction of the number of data points, leading to concerns of overfitting. (The model of Atkinson (ref 32) also uses a large number of parameters, indicating the complexity of the problem.) If the model is used on a family of molecules, then the fitting can be quite good with a minimal number of parameters.

The limited size of the training set affects the quality of the ANN in (at least) two ways. First, the diversity of chemical features (e.g., types of chemical bonding, the presence of cyclic and aromatic structures, bonding patterns, and neighbor effects) cannot be encompassed by the relatively small number of compounds used in training. To overcome this problem, new experimental or computational determinations of rate constants


are needed. Second, the small size of the training set may lead to biased results, especially when the uncertainty in the input rate constant is large or the value is in error.

There are additional sources of error in the present approach that lead to difficulties in the training of the ANN, to larger measures of the error (MSE), and to reduced predictive capability. One obvious source of error is the uncertainty in the •OH reaction rate constants. Inflexibility in the data model (e.g., the inability to distinguish isomers) similarly can have a negative impact on the final result.

Future refinements to the present work include more careful screening (or removal) of rate constants, incorporation of additional theoretical values, the addition of new values (from experiment or theory), and an improved data model. For example, descriptors used in other QSAR (or similar) models and other indices based on cheminformatics methods might be used. (However, those that have been tested so far, such as the Wiener index (ref 76) and other similar indices that have been successfully used elsewhere, produce no significant improvement using the present model.) An additional improvement to the current model would be to eliminate or mitigate the bias introduced by the use of the logarithmic function. The approach used by Atkinson (ref 32) computed the final rate constant as the sum of four fitted expressions. Adopting a similar approach would result in the creation of four new ANN models to compute the rate constant. Each of these models would presumably have reduced complexity and increased accuracy for a particular chemistry. Approaches based on transition state theory were also explored, but they were not found to yield a significant improvement to the model. The use of descriptors from quantum chemistry calculations was not considered in this study as the goal was to minimize the amount of time and effort needed to apply the method. In a future study, the use of inputs derived from quantum chemistry calculations may be considered. Such refinements may lead to an ANN with improved predictive power and broader applicability.
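To illustrate the kind of 2D information such a data model relies on, the sketch below counts bonds and bends for two compounds from Table 1, HFC-152a (CH3CHF2) and HFC-152 (CH2FCH2F). The atom numbering, the labeling of bond and bend types by element symbols alone, and the function name `count_internal_coordinates` are assumptions made for illustration; the internal coordinates actually used in the model are listed in the Supporting Information.

```python
from collections import Counter
from itertools import combinations


def count_internal_coordinates(atoms, bonds):
    """Count bond types (A-B) and bend types (A-center-B) in a molecular graph."""
    neighbors = {i: [] for i in range(len(atoms))}
    counts = Counter()
    for i, j in bonds:
        neighbors[i].append(j)
        neighbors[j].append(i)
        first, second = sorted((atoms[i], atoms[j]))
        counts[f"{first}-{second}"] += 1
    for center, attached in neighbors.items():
        # Every pair of atoms bonded to the same center defines one bend.
        for a, b in combinations(attached, 2):
            first, second = sorted((atoms[a], atoms[b]))
            counts[f"{first}-{atoms[center]}-{second}"] += 1
    return counts


# Both molecules share this bond list; atoms 0 and 1 are the two carbons.
skeleton = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 5), (1, 6), (1, 7)]
# HFC-152a, CH3CHF2: three H on carbon 0; one H and two F on carbon 1.
hfc152a = count_internal_coordinates(["C", "C", "H", "H", "H", "H", "F", "F"], skeleton)
# HFC-152, CH2FCH2F: two H and one F on each carbon.
hfc152 = count_internal_coordinates(["C", "C", "H", "H", "F", "H", "H", "F"], skeleton)

bond_types = ["C-C", "C-H", "C-F"]
print({b: (hfc152a[b], hfc152[b]) for b in bond_types})
print("F-C-F bends:", hfc152a["F-C-F"], hfc152["F-C-F"])
```

In this sketch the two positional isomers have identical bond counts and differ only in their bend counts (for example, F-C-F). Stereoisomers, which share the same 2D connectivity, would give identical counts throughout.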


ASSOCIATED CONTENT

Supporting Information

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.jpcb.5b09558. List of internal coordinates used in the model and tables of the rate constants used to train the ANN and predicted by the ANN (PDF)



AUTHOR INFORMATION

Corresponding Author

*Phone: 301-975-2216. E-mail: [email protected].

Notes

The author declares no competing financial interest.



ACKNOWLEDGMENTS

The author is grateful to Dr. Charles Bevington of the United States Environmental Protection Agency for providing rate constants from the AOPWIN software package.



REFERENCES

(1) IPCC, 2007: Climate Change 2007: The Physical Science Basis. Contribution of Working Group I to the Fourth Assessment Report of the Intergovernmental Panel on Climate Change; Solomon, S., Qin, D., Manning, M., Chen, Z., Marquis, M., Averyt, K. B., Tignor, M., Miller, H. L., Eds.; Cambridge University Press: Cambridge, 2007. (2) Hodnebrog, O.; Etminan, M.; Fuglestvedt, J. S.; Marston, G.; Myhre, G.; Nielsen, C. J.; Shine, K. P.; Wallington, T. J. Global Warming Potentials and Radiative Efficiencies of Halocarbons and Related Compounds: A Comprehensive Review. Rev. Geophys. 2013, 51, 300−378. (3) Elrod, M. J. Greenhouse Warming Potentials from the Infrared Spectroscopy of Atmospheric Gases. J. Chem. Educ. 1999, 76, 1702− 1705. (4) Papasavva, S.; Tai, S.; Esslinger, A.; Illinger, K. H.; Kenny, J. E. Ab Initio Calculations of Vibrational Frequencies and Infrared Intensities for Global Warming Potential of CFC Substitutes: CF3CH2F (HFC134a). J. Phys. Chem. 1995, 99, 3438−3443. (5) Blowers, P.; Hollingshead, K. Estimations of Global Warming Potentials from Computational Chemistry Calculations for CH2F2 and Other Fluorinated Methyl Species Verified by Comparison to Experiment. J. Phys. Chem. A 2009, 113, 5942−5950. (6) Evans, M. G.; Polanyi, M. Equilibrium Constants and Velocity Constants. Nature (London, U. K.) 1936, 137, 530−531. (7) Gaffney, J. S.; Levine, S. Z. Predicting Gas-Phase Organic Molecule Reaction Rates Using Linear Free-Energy Correlations. O(3P) and OH Addition and Abstraction Reactions. Int. J. Chem. Kinet. 1979, 11, 1197−1209. (8) Grosjean, D.; Williams, E. L., II Environmental Persistance of Organic Compounds Estimated from Structure-Reactivity and Linear Free-Energy Relationships−Unsaturated Aliphatics. Atmos. Environ., Part A 1992, 26, 1395−1405. (9) Khan, S. S.; Zhang, Q.; Broadbelt, L. J. Automated Mechanism Generation. Part I. Mechanism Development and Rate Constants Estimation for VOC Chemistry in the Atmosphere. J. Atmos. Chem. 
2009, 63, 125−156. (10) Güsten, H.; Filby, W. G.; Schoof, S. Prediction of Hydroxyl Radical Reaction Rates with Organic Compounds. Atmos. Environ. 1981, 15, 1763−1765. (11) Dilling, W. L.; Gonsior, S. J.; Boggs, G. U.; Mendoza, C. G. Organic Photochemistry 20: A Method for Estimating Gas-Phase Rate Constants for Reactions of Hydroxyl Radicals with Organic Compounds from Their Relative Rates of Reaction with Hydrogen Peroxide Under Photolysis in 1,1,2-Trichlorotrifluoroethane Solution. Environ. Sci. Technol. 1988, 22, 1447−1453.



CONCLUSIONS

In this article, the construction of an ANN for the prediction of •OH reaction rate constants at 298 K has been described. The data model uses parameters derived solely from the chemical bonding (i.e., bonds and bends) and the spin multiplicity. The ANN was trained using a set of 791 rate constants taken from the NIST Chemical Kinetics Database. Evaluation of the performance of the ANN was accomplished via construction of a number of ANNs trained using a subset of the full data set, with the remainder used for evaluation. The MUPE was used as the primary error metric. The training-set MUPE produced in these trials was 12%. Training using 90% of the data set and testing against the remaining 10% of the data shows that the MUPE increases by about a factor of 4 when the ANN is used to predict rate constants. In general, the performance of the ANN methodology is encouraging, but it is desirable to eliminate some of the larger errors observed both in training and in testing. With further refinement, the ANN approach demonstrated in this article should be suitable for predicting and screening rate constants for reactions of small molecules with •OH radical. The ability to rapidly predict rate constants of this type has applications in evaluating the global warming potential.


Radicals with Organic Compounds. Int. J. Chem. Kinet. 1987, 19, 799− 828. (32) Kwok, E. S. C.; Atkinson, R. Estimation of Hydroxyl Radical Reaction Rate Constants for Gas-Phase Organic Compounds Using a Structure-Reactivity Relationship: An Update. Atmos. Environ. 1995, 29, 1685−1695. (33) Klamt, A. Estimation of Gas-Phase Hydroxyl Radical Rate Constants of Organic Compounds from Molecular Orbital Calculations. Chemosphere 1993, 26, 1273−1289. (34) Klamt, A. Estimation of Gas-Phase Hydroxyl Radical Rate Constants of Oxygenated Compounds Based on Molecular Orbital Calculations. Chemosphere 1996, 32, 717−726. (35) Böhnhardt, A.; Kühne, R.; Ebert, R.-U.; Schüürmann, G. Indirect Photolysis of Organic Compounds: Prediction of OH Reaction Rate Constants Through Molecular Orbital Calculations. J. Phys. Chem. A 2008, 112, 11391−11399. (36) Böhnhardt, A.; Kühne, R.; Ebert, R.-U.; Schüürmann, G. Predicting Rate Constants of OH Radical Reactions with Organic Substances: Advances for Oxygenated Organics Through a Molecular Orbital HF/6-31G** Approach. Theor. Chem. Acc. 2010, 127, 355− 367. (37) Atkinson, R. Estimation of Gas-Phase Hydroxyl Radical Rate Constants for Organic Chemicals. Environ. Toxicol. Chem. 1988, 7, 435−442. (38) Wallington, T. J.; Dagaut, P.; Liu, R.; Kurylo, M. J. Rate Constants for the Gas Phase Reactions of OH with C5 Through C7 Aliphatic Alcohols and Ethers: Predicted and Experimental Values. Int. J. Chem. Kinet. 1988, 20, 541−547. (39) Tosato, M. L.; Chiorboli, C.; Eriksson, L.; Jonsson, J. Multivariate Modeling of the Rate Constants of the Gas-Phase Reaction of Haloalkanes with the Hydroxyl Radical. Sci. Total Environ. 1991, 109-110, 307. (40) Meylan, W. M.; Howard, P. H. Computer Estimation of the Atmospheric Gas-Phase Reaction Rate of Organic Compounds with Hydroxyl Radicals and Ozone. Chemosphere 1993, 26, 2293−2299. (41) Güsten, H. Predicting the Abiotic Degradability of Organic Pollutants in the Troposphere. Chemosphere 1999, 38, 1361−1370. 
(42) Gramatica, P.; Consonni, V.; Todeschini, R. QSAR Study on the Tropospheric Degradation of Organic Compounds. Chemosphere 1999, 38, 1371−1378. (43) Neeb, P. Structure-Reactivity Based Estimation of the Rate Constants for Hydroxyl Radical Reactions with Hydrocarbons. J. Atmos. Chem. 2000, 35, 295−315. (44) Alvarez-Idaboy, J. R.; Cruz-Torres, A.; Galano, A.; Ruiz-Santoyo, M. S. Structure-Reactivity Relationship in Ketones + OH Reactions: A Quantum Mechanical and TST Approach. J. Phys. Chem. A 2004, 108, 2740−2749. (45) Gramatica, P.; Pilutti, P.; Papa, E. Validated QSAR Prediction of OH Tropospheric Degradation of VOCs: Splitting into Training−Test Sets and Consensus Modeling. J. Chem. Inf. Model. 2004, 44, 1794−1802. (46) Öberg, T. A QSAR for the Hydroxyl Radical Reaction Rate Constant: Validation, Domain of Application, and Prediction. Atmos. Environ. 2005, 39, 2189−2200. (47) Wilson, E. W., Jr.; Hamilton, W. A.; Kennington, H. R.; Evans, B., III; Scott, N. W.; DeMore, W. B. Measurement and Estimation of Rate Constants for the Reactions of Hydroxyl Radical with Several Alkanes and Cycloalkanes. J. Phys. Chem. A 2006, 110, 3593−3604. (48) Long, X.; Niu, J. Estimation of Gas-Phase Reaction Rate Constants of Alkylnaphthalenes with Chlorine, Hydroxyl and Nitrate Radicals. Chemosphere 2007, 67, 2028−2034. (49) Wang, Y.; Chen, J.; Li, X.; Wang, B.; Cai, X.; Huang, L. Predicting Rate Constants of Hydroxyl Radical Reactions with Organic Pollutants: Algorithm, Validation, Applicability Domain, and Mechanistic Interpretation. Atmos. Environ. 2009, 43, 1131−1135. (50) Huang, X.; Yu, X.; Yi, B.; Zhang, S. Prediction of Rate Constants for the Reactions of Alkanes with the Hydroxyl Radicals. J. Atmos. Chem. 2012, 69, 201−213.

(12) Ohta, T. Rate Constants for the Reactions of Diolefins with OH Radicals in the Gas Phase. Estimate of the Rate Constants from Those for Monoolefins. J. Phys. Chem. 1983, 87, 1209−1213. (13) Hodson, J. The Estimation of the Photodegradation of Organic Compounds by Hydroxyl Radical Reaction Rate Constants Obtained from Nuclear Magnetic Resonance Spectroscopy Chemical Shift Data. Chemosphere 1988, 17, 2339−2348. (14) Heicklen, J. The Correlation of Rate Coefficients for H-Atom Abstraction by HO Radicals with C−H Bond Dissociation Enthalpies. Int. J. Chem. Kinet. 1981, 13, 651−665. (15) Jolly, G. S.; Paraskevopoulos, G.; Singleton, D. L. Rates of OH Radical Reactions. XII. The Reactions of OH with C-C3H6, C-C5H10, and C-C7H14. Correlation of Hydroxyl Rate Constants with Bond Dissociation Energies. Int. J. Chem. Kinet. 1985, 17, 1−10. (16) Chandra, A. K.; Uchimaru, T. A DFT Study on the C−H Bond Dissociation Enthalpies of Haloalkanes: Correlation Between the Bond Dissociation Enthalpies and Activation Energies for Hydrogen Abstraction. J. Phys. Chem. A 2000, 104, 9244−9249. (17) Chandra, A. K.; Uchimaru, T. The C−H Bond Dissociation Enthalpies of Haloethers and Its Correlation with the Activation Energies for Hydrogen Abstraction by OH Radical: A DFT Study. Chem. Phys. Lett. 2001, 334, 200−206. (18) Chandra, A. K.; Uchimaru, T.; Urata, S.; Sugie, M.; Sekiya, A. Estimation of Rate Constants for Hydrogen Atom Abstraction by OH Radicals Using the C−H Bond Dissociation Enthalpies: Haloalkanes and Haloethers. Int. J. Chem. Kinet. 2003, 35, 130−138. (19) Urata, S.; Takada, A.; Uchimaru, T.; Chandra, A. K.; Sekiya, A. Artificial Neural Network Study for the Estimation of the C−H Bond Dissociation Energies. J. Fluorine Chem. 2002, 116, 163−171. (20) Urata, S.; Takada, A.; Uchimaru, T.; Chandra, A. K. Rate Constants Estimation for the Reaction of Hydrofluorocarbons and Hydrofluoroethers with OH Radicals. Chem. Phys. Lett. 2003, 368, 215−233.
(21) Güsten, H.; Klasinc, L.; Marić, D. Prediction of the Abiotic Degradability of Organic Compounds in the Troposphere. J. Atmos. Chem. 1984, 2, 83−93. (22) Atkinson, R. Kinetics and Mechanisms of the Gas-Phase Reactions of the Hydroxyl Radical with Organic Compounds Under Atmospheric Conditions. Chem. Rev. 1986, 86, 69−201. (23) Cohen, N.; Benson, S. W. Empirical Correlations for Rate Coefficients for Reactions of OH with Haloalkanes. J. Phys. Chem. 1987, 91, 171−175. (24) Bartolotti, L. J.; Edney, E. O. Investigation of the Correlation Between the Energy of the Highest Occupied Molecular Orbital (HOMO) and the Logarithm of the OH Rate Constant of Hydrofluorocarbons and Hydrofluoroethers. Int. J. Chem. Kinet. 1994, 26, 913−920. (25) Percival, C. J.; Marston, G.; Wayne, R. P. Correlations Between Rate Parameters and Calculated Molecular Properties in the Reactions of the Hydroxyl Radical with Hydrofluorocarbons. Atmos. Environ. 1995, 29, 305−311. (26) DeMore, W. B. Regularities in Arrhenius Parameters for Rate Constants of Abstraction Reactions of Hydroxyl Radical with C-H Bonds. J. Photochem. Photobiol., A 2005, 176, 129−135. (27) Benson, S. W.; Buss, J. H. Additivity Rules for the Estimation of Molecular Properties. Thermodynamic Properties. J. Chem. Phys. 1958, 29, 546−572. (28) DeMore, W. B. Experimental and Estimated Rate Constants for the Reactions of Hydroxyl Radicals with Several Halocarbons. J. Phys. Chem. 1996, 100, 5813−5820. (29) Sumathi, R.; Carstensen, H.-H.; Green, W. H., Jr. Reaction Rate Prediction Via Group Additivity Part 1: H Abstraction from Alkanes by H and CH3. J. Phys. Chem. A 2001, 105, 6910−6925. (30) Kazakov, A.; McLinden, M. O.; Frenkel, M. Computational Design of New Refrigerant Fluids Based on Environmental, Safety, and Thermodynamic Characteristics. Ind. Eng. Chem. Res. 2012, 51, 12537−12548. (31) Atkinson, R. A Structure-Activity Relationship for the Estimation of Rate Constants for the Gas-Phase Reactions of OH I


(73) Riedmiller, M.; Braun, H. A Direct Adaptive Method for Faster Backpropagation Learning: The RPROP Algorithm 1993, 586−591. (74) The Atmospheric Oxidation Program for Microsoft Windows (AOPWIN), v1.92a; U.S. Environmental Protection Agency: Washington, DC, 2010. (75) Sander, S. P.; Friedl, R. R.; Barker, J. R.; Golden, D. M.; Kurylo, M. J.; Wine, P. H.; Abbatt, J. P. D.; Burkholder, J. B.; Kolb, C. E.; Moortgat, G. K. et al. Chemical Kinetics and Photochemical Data for Use in Atmospheric Studies. Evaluation Number 17. JPL Publication 10-6; Jet Propulsion Laboratory: Pasadena, CA, 2011. (76) Wiener, H. Structural Determination of Paraffin Boiling Points. J. Am. Chem. Soc. 1947, 69, 17−20.

(51) Poutsma, M. L. Evolution of Structure-Reactivity Correlations for the Hydrogen Abstraction Reaction by Chlorine Atom. J. Phys. Chem. A 2013, 117, 687−703. (52) Poutsma, M. L. Evolution of Structure-Reactivity Correlations for the Hydrogen Abstraction Reaction by Hydroxyl Radical and Comparison with That by Chlorine Atom. J. Phys. Chem. A 2013, 117, 6433−6449. (53) Leroy, G.; Sana, M. A. A Semi-Empirical Method for the Estimation of Activation Barriers and Kinetic Parameters of Atom Transfer Reactions. J. Mol. Struct.: THEOCHEM 1986, 136, 283−301. (54) Cohen, N.; Benson, S. W. Transition-State-Theory Calculations for Reactions of OH with Haloalkanes. J. Phys. Chem. 1987, 91, 162− 170. (55) Cohen, N. The Use of Transition-State Theory to Extrapolate Rate Coefficients for Reactions of OH with Alkane Reactions. Int. J. Chem. Kinet. 1982, 14, 1339−1362. (56) Cohen, N. Are Reaction Rate Coefficients Additive? Revised Transition State Theory Calculations for OH + Alkane Reactions. Int. J. Chem. Kinet. 1991, 23, 397−417. (57) Xing, S.-B.; Shi, S.-H.; Qui, L.-X. Kinetics Studies of Reactions of OH Radicals with Four Haloethanes. Part I. Experiment and BEBO Calculations. Int. J. Chem. Kinet. 1992, 24, 1−10. (58) Donahue, N. M.; Clarke, J. S.; Anderson, J. G. Predicting Radical-Molecule Barrier Heights: The Role of the Ionic Surface. J. Phys. Chem. A 1998, 102, 3923−3933. (59) Huynh, L. K.; Ratkiewicz, A.; Truong, T. N. Kinetics of the Hydrogen Abstraction Reaction OH + Alkane → H2O + Alkyl Reaction Class: An Application of the Reaction Class Transition State Theory. J. Phys. Chem. A 2006, 110, 473−484. (60) Garrett, B. C.; Truhlar, D. G.; Grev, R. S.; Magnuson, A. W. Improved Treatment of Threshold Contributions in Variational Transition-State Theory. J. Phys. Chem. 1980, 84, 1730−1748. (61) Lu, D.; Truong, T. N.; Melissas, V. S.; Lynch, G. C.; Liu, Y.-P.; Garrett, B. C.; Steckler, R.; Isaacson, A. D.; Rai, S. N.; Hancock, G. C.; et al. 
POLYRATE 4: A New Version of a Computer Program for the Calculation of Chemical Reaction Rates for Polyatomics. Comput. Phys. Commun. 1992, 71, 235−262. (62) Liu, Y.-P.; Lynch, G. C.; Truong, T. N.; Lu, D. H.; Truhlar, D. G.; Garrett, B. C. Molecular Modeling of the Kinetic Isotope Effect for the [1,5]-Sigmatropic Rearrangement of Cis-1,3-Pentadiene. J. Am. Chem. Soc. 1993, 115, 2408−2415. (63) Haykin, S. Neural Networks: A Comprehensive Foundation; Macmillan College Publishing Company: New York, 1994. (64) Cybenko, G. Approximations by Superpositions of Sigmoidal Functions. Math. Control Signals Syst. 1989, 2, 303−314. (65) Manion, J. A.; Huie, R. E.; Levin, R. D.; Burgess, D., Jr.; Orkin, V. L.; Tsang, W.; McGivern, W. S.; Hudgens, J. W.; Knyazev, V. D.; Atkinson, D. B. et al. NIST Chemical Kinetics Database, NIST Standard Reference Database 17, version 7.0 (Web Version), release 1.6.7, data version 2013.03; National Institute of Standards and Technology: Gaithersburg, MD, 2013. (66) NIST Chemical Kinetics Database, 2013; http://kinetics.nist.gov/ (accessed: December 1, 2015). (67) Linstrom, P. J.; Mallard, W. G. NIST Chemistry WebBook, NIST Standard Reference Database Number 69; National Institute of Standards and Technology: Gaithersburg, MD. http://webbook.nist. gov/ (accessed: December 1, 2015). (68) O’Boyle, N. M.; Banck, M.; James, C. A.; Morley, C.; Vandermeersch, T.; Hutchison, G. R. Open Babel: An Open Chemical Toolbox. J. Cheminf. 2011, 3, 33. (69) The Open Babel Package, version 2.3.1, 2013; http://openbabel. org/ (accessed: December 1, 2015). (70) Nissen, S. Implementation of a Fast Artificial Neural Network Library (FANN), 2003. (71) Elliott, D. L. A. Better Activation Function for Artificial Neural Networks, 1993. (72) Nguyen, D.; Widrow, B. Improving the Learning Speed of 2-Layer Neural Networks by Choosing Initial Values of the Adaptive Weights 1990, 3, 21−26. J
