Machine Learnt Coarse-Grained Models - The Journal of Physical

Chemical and Dynamical Processes in Solution; Polymers, Glasses, and Soft Matter

Machine Learnt Coarse-Grained Models
Karteek K. Bejagam, Samrendra Kumar Singh, Yaxin An, and Sanket A Deshmukh
J. Phys. Chem. Lett., Just Accepted Manuscript • DOI: 10.1021/acs.jpclett.8b01416 • Publication Date (Web): 19 Jul 2018
Downloaded from http://pubs.acs.org on July 25, 2018

The Journal of Physical Chemistry Letters is published by the American Chemical Society, 1155 Sixteenth Street N.W., Washington, DC 20036. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

The Journal of Physical Chemistry Letters

Machine Learnt Coarse-Grained Models

Karteek K. Bejagam (a,#), Samrendra Singh (b,#), Yaxin An (a), Sanket A. Deshmukh (a,*)

(a) Department of Chemical Engineering, Virginia Tech, Blacksburg, Virginia 24061, United States
(b) CNH Industrial, Burr Ridge, Illinois 60527, United States

AUTHOR INFORMATION
Corresponding Author
* [email protected], Phone: +1 540-231-8785

Abstract
Optimizing force-field (FF) parameters for molecular dynamics (MD) simulations is a challenging and time-consuming process. Here, we present a novel FF optimization framework that integrates MD simulations with a particle swarm optimization (PSO) algorithm and an artificial neural network (ANN). This new ANN-assisted-PSO framework was used to develop transferable coarse-grained (CG) models for D2O and DMF as a proof of concept. The PSO algorithm was used to generate the sets of input FF parameters for the MD simulations of the CG models of these solvents, which were optimized to reproduce their experimental properties. Herein, for the first time, a reverse approach was employed for on-the-fly training of the ANN model, in which the results (solvent properties) obtained from the MD simulations and their corresponding FF parameters were used as inputs and outputs, respectively. The ANN model was then required to predict a set of new FF parameters, which were tested for their ability to predict the desired
experimental properties. This new framework can be extended to integrate any optimization algorithm with ANN and MD simulations to accelerate the FF development.

The optimization of force-field (FF) parameters used in molecular dynamics (MD) simulations to reproduce the experimental properties of a given system is among the most tedious and time-consuming steps of model development. Traditionally, a FF parameter set is developed manually by trial and error, which is expensive, inefficient, and labor-intensive.1,2 In recent years, with the increase in computing power, optimization algorithms including the genetic algorithm, particle swarm optimization (PSO), the simplex method, and gradient descent have been used to develop FF parameter sets.3–7 These methods yield, in a reasonable computing time, optimized parameters that reproduce several experimental properties of a given system with good accuracy.3–6 For example, PSO is a population-based metaheuristic global optimization technique that works iteratively to guide the particles closer to the best solution with each iteration.8 During a FF optimization performed using PSO, each particle refers to a set of FF parameters, which is used to perform an MD simulation. In each iteration, the fitness value of a particle is determined by comparing the results of the MD simulation with the experimental target properties. The personal best (pbest) of a particle is the location at which that particle attained its highest fitness value. The particle with the least error (highest fitness) in the entire optimization cycle is assigned as the global best (gbest). PSO adjusts the positions of the particles (i.e., modifies the FF parameter sets) so that the particles converge toward the global best in steps with a certain velocity. Typically, the optimization is
performed until the error decreases below a desired value (~2 to 5 %) (see Section 1 of the Supporting Information for more details on the PSO method).

A variety of sensitivity analysis, uncertainty quantification, and machine-learning (ML) methods have also been used to develop FF parameters and to predict material properties for a range of applications.9–16 For example, using data available in the literature, ML models have been successfully trained to predict phase diagrams, crystal structures, glass transition temperatures, and the dielectric properties of polymers, to name a few.17–23 Of particular relevance to the current manuscript is the successful use of ML methods such as artificial neural networks (ANN), Gaussian processes, and other algorithms in the development of accurate FF parameters for MD simulations.24–31 In addition, ML models with an adaptive on-the-fly learning scheme have been used to accelerate ab initio MD simulations.32

The ANN is one of the most widely used supervised ML methods for building predictive models, with applications including predicting the properties of composite materials, accessing high-dimensional free energy landscapes, and refining atomistic force-field potentials.33–35 Its underlying design is inspired by the network of interconnected neurons in the brain and its ability to perform logically complex tasks. Given a set of input variables and output targets, an ANN can be trained to learn the relationship between the input and output sets.28,36 A simple ANN consists of an input layer, an output layer, and possibly a few hidden layers between them. In the present study, the number of input nodes was set to the number of target properties to be reproduced by a given CG model. The number of nodes in the output layer was set to the total number of quantities to be optimized in a given FF parameter set.
More details on the ANN model are discussed in Section 1 of the Supporting Information. To the best of our knowledge, the present contribution is the first attempt to train the ANN model on-the-fly in a reverse fashion to accelerate the FF development for CG models. Specifically, the properties predicted by the MD simulations and their corresponding FF parameters generated by the PSO algorithm were utilized as inputs and outputs for the ANN model, respectively. The ANN model was then required to predict a set of new FF parameters, which should result in the desired experimental properties. These new parameters predicted by the ANN model were tested along with the MD input parameters generated by the PSO algorithm during the FF optimization process.
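
The reverse training direction described above can be illustrated with a minimal sketch. It is not the authors' network (their ANN is described in the Supporting Information); it assumes a single tanh hidden layer trained by plain stochastic gradient descent, and `TinyMLP`, `toy_md`, the layer size, and all numbers are hypothetical stand-ins chosen only so the example runs without an MD engine. The point is the data layout: simulated properties enter as inputs, and the FF parameters that produced them are the training targets.

```python
import math
import random


class TinyMLP:
    """One tanh hidden layer and linear outputs, trained by stochastic gradient descent."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
        self.b2 = [0.0] * n_out

    def _forward(self, x):
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        out = [sum(w * h for w, h in zip(row, hidden)) + b
               for row, b in zip(self.w2, self.b2)]
        return hidden, out

    def predict(self, x):
        return self._forward(x)[1]

    def loss(self, xs, ys):
        """Squared error summed over outputs and averaged over samples."""
        return sum(sum((p - t) ** 2 for p, t in zip(self.predict(x), y))
                   for x, y in zip(xs, ys)) / len(xs)

    def train(self, xs, ys, epochs=200, lr=0.05):
        for _ in range(epochs):
            for x, y in zip(xs, ys):
                hidden, out = self._forward(x)
                d_out = [o - t for o, t in zip(out, y)]
                # Back-propagate through the output weights before they are updated.
                d_hid = [(1.0 - h * h) * sum(d * row[j] for d, row in zip(d_out, self.w2))
                         for j, h in enumerate(hidden)]
                for k, d in enumerate(d_out):
                    for j, h in enumerate(hidden):
                        self.w2[k][j] -= lr * d * h
                    self.b2[k] -= lr * d
                for j, d in enumerate(d_hid):
                    for i, xi in enumerate(x):
                        self.w1[j][i] -= lr * d * xi
                    self.b1[j] -= lr * d


def toy_md(params):
    """Hypothetical stand-in for an MD run: FF parameters -> two scaled properties."""
    eps, sigma = params
    return [0.6 * eps + 0.3 * sigma, 0.8 * sigma - 0.2 * eps]


rng = random.Random(1)
ff_params = [[rng.random(), rng.random()] for _ in range(60)]  # sets the optimizer tried
properties = [toy_md(p) for p in ff_params]                    # what the "MD" returned

# Reverse direction: properties are the inputs, FF parameters are the outputs.
net = TinyMLP(n_in=2, n_hidden=6, n_out=2)
loss_before = net.loss(properties, ff_params)
net.train(properties, ff_params)
loss_after = net.loss(properties, ff_params)

target_properties = toy_md([0.4, 0.6])              # pretend experimental targets
suggested_params = net.predict(target_properties)   # candidate FF set to test with MD
```

In this layout the input layer has one node per target property and the output layer one node per FF parameter, matching the node counts stated in the text. The suggested parameter set is only a candidate; it still has to be evaluated by an actual simulation.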

During a FF optimization performed using only the PSO algorithm, the progress made by a particle (here, a particle refers to a set of FF parameters) is based entirely on three factors: (i) its current state, (ii) its individual best, and (iii) the global best of the swarm.8 Thus, the large input and output dataset generated along the path of each particle during a PSO run is forgotten and/or remains unutilized. Here, we use all or part of the MD input and output data generated during a PSO optimization cycle to train, on-the-fly, an ANN-based ML model. We then use the predictions from the ANN model as inputs for the MD simulations in the next optimization cycle (Figure 1). We refer to this integration of the PSO method with an ANN model as the ANN-assisted-PSO method. The predictions obtained from the trained ANN model were used to supply one or more additional particles to the PSO swarm. In principle, if the particle that uses the ANN-predicted parameters yields better properties than the PSO particles, then the ANN model has effectively guided the swarm. Here, the ANN model was trained by using the top 20 % of the data or 100 datasets, whichever is greater, generated during and after the first cycle of the PSO, irrespective of the total number of particles used in the PSO swarm.
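
The interplay between the swarm and an extra, externally proposed particle can be sketched as follows. This is a minimal illustration that assumes the textbook inertia-weight PSO update; the coefficients `w`, `c1`, `c2`, the function names, and the toy error surface are assumptions, not the settings used in this work (those are in the Supporting Information). The `propose` hook marks where an ANN prediction would enter; here it is replaced by a simple stand-in, `centroid_of_best`, so that the example is self-contained.

```python
import random


def pso_minimize(error_fn, bounds, n_particles=8, n_iter=40, propose=None,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise error_fn inside box bounds; propose(history) may add one candidate per iteration."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    err = [error_fn(p) for p in pos]
    history = [(tuple(p), e) for p, e in zip(pos, err)]  # every (parameters, error) pair so far
    pbest, pbest_err = [p[:] for p in pos], err[:]
    best = min(range(n_particles), key=lambda i: err[i])
    gbest, gbest_err = pos[best][:], err[best]
    trace = [gbest_err]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Three ingredients: current state, personal best, global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            e = error_fn(pos[i])
            history.append((tuple(pos[i]), e))
            if e < pbest_err[i]:
                pbest[i], pbest_err[i] = pos[i][:], e
        i_best = min(range(n_particles), key=lambda i: pbest_err[i])
        if pbest_err[i_best] < gbest_err:
            gbest, gbest_err = pbest[i_best][:], pbest_err[i_best]
        if propose is not None:
            # The extra particle replaces gbest only if it is strictly better.
            cand = [min(hi, max(lo, x)) for x, (lo, hi) in zip(propose(history), bounds)]
            e = error_fn(cand)
            history.append((tuple(cand), e))
            if e < gbest_err:
                gbest, gbest_err = cand, e
        trace.append(gbest_err)
    return gbest, gbest_err, trace


def centroid_of_best(history, fraction=0.2):
    """Stand-in for the ANN: propose the mean of the lowest-error parameter sets."""
    ranked = sorted(history, key=lambda rec: rec[1])
    top = ranked[:max(1, int(fraction * len(ranked)))]
    return [sum(rec[0][d] for rec in top) / len(top) for d in range(len(top[0][0]))]


def toy_error(params):
    """Hypothetical error surface standing in for 'run MD, compare with experiment'."""
    return (params[0] - 0.3) ** 2 + (params[1] - 0.7) ** 2


bounds = [(0.0, 1.0), (0.0, 1.0)]
_, err_pso, trace_pso = pso_minimize(toy_error, bounds)
_, err_assisted, trace_assisted = pso_minimize(toy_error, bounds, propose=centroid_of_best)
```

Because the proposed particle can only replace the global best when its error is lower, the best-so-far error never increases whether or not the proposals are useful. The sketch makes no claim about how much faster the assisted run converges on a real FF problem.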

Figure 1: Scheme for the ANN-assisted-PSO method. Detailed description of the integration of ANN with PSO and MD simulations is given in Figure S3 of the Supporting Information.

To demonstrate the efficiency and effectiveness of this new ANN-assisted-PSO method and to compare its performance with that of FF optimization using only the PSO algorithm, we have developed new CG models of D2O and DMF. The mapping schemes used to
describe these CG models are shown in Figure 2 a-f. For both D2O and DMF, two types of models were developed: (i) a CG model in which a polarizable functional group in a molecule was represented by a core bead and a dummy bead with +q and -q partial charges, respectively (referred to as a polarizable model), and (ii) a CG model in which a polarizable functional group in a molecule was represented by a chargeless bead (referred to as a non-polarizable model). These models were chosen because they have different numbers of beads, and therefore different numbers of parameters, which allows both optimization approaches to be compared. More details on the mapping schemes of these models and the FF functional form are discussed in Section 2 of the Supporting Information. The experimental values of density, self-diffusion coefficient, and dielectric constant (only for the polarizable models) at 300 K were used as the target properties to tune the set of FF parameters for both D2O and DMF. The FF optimization was carried out with 4, 8, and 40 PSO particles for all four CG models, and each optimization was repeated three times for a rigorous comparison between the two approaches. Note that when developing FF parameters for a CG or an all-atom model with either approach (PSO only or ANN-assisted-PSO), multiple trials are not necessary if the error between the target and predicted properties is within the tolerance set by the user.
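
One simple way to turn the comparison with the target properties into a single error value is a mean absolute relative error, sketched below. The form of the objective actually used is not given in this text, so this function is an assumption. The property names and numbers are placeholders chosen for easy arithmetic, not experimental data.

```python
def property_error(predicted, target):
    """Mean absolute relative error, in percent, over the properties listed in `target`."""
    terms = [abs(predicted[name] - ref) / abs(ref) for name, ref in target.items()]
    return 100.0 * sum(terms) / len(terms)


# Placeholder values in arbitrary units. A non-polarizable model would simply
# leave the dielectric constant out of `target`, as done here.
target = {"density": 1.0, "self_diffusion": 2.0}
predicted = {"density": 1.02, "self_diffusion": 1.9, "dielectric": 70.0}

error_percent = property_error(predicted, target)  # about 3.5 (2 % and 5 % averaged)
within_tolerance = error_percent <= 5.0            # user-chosen tolerance
```

Only the keys present in `target` contribute, so the same function serves models with two or three target properties.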

Figure 2: Mapping schemes adopted in the development of the CG models of D2O (a) and DMF (d). Non-polarizable CG models of D2O (b), and DMF (e). Polarizable models of D2O (c), and DMF (f). Green and red beads represent core and virtual sites in the polarizable models, respectively.

For the development of the D2O non-polarizable and polarizable models, the numbers of parameters to be determined were two and five, respectively. During the optimization, MD simulations of 500 CG D2O molecules were performed at 300 K with a time-step of 15 fs. The performance of the PSO-only and ANN-assisted-PSO methods, averaged over three independent optimization runs, is shown in Figure 3 a-f. The data for individual runs and a detailed analysis of the performance of the ANN-assisted-PSO approach can be found in Figures S4 and S5 and Section 3.1 of the Supporting Information. This analysis shows that on several occasions during an optimization run (in as many as 75 of a total of 80 iterations), the ANN particle had the lowest error and drove all other particles toward the optimal solution. In the case of the non-polarizable D2O model, the ANN-assisted-PSO approach optimized the parameters more efficiently than the PSO-only run, which is evident from the faster decrease in the error. For 4 particles, the ANN-assisted-PSO method performed better than PSO alone. For the polarizable D2O model, five parameters had to be optimized to form a complete FF set. With this increase in the number of quantities, the ANN-assisted-PSO method showed significant improvements over the PSO-only method for the 8- and 4-particle runs. To further test how the error associated with the ANN particle converges with the number of training data, we randomly selected one of the D2O non-polarizable models that was developed by using 40 particles in the ANN-assisted-PSO study (see Section 3.2 of the Supporting Information). We trained the ANN model on 50, 100, 150, 200, and 400 datasets, which were filtered from larger pools of datasets 1-250, 1-500, 1-750, 1-1000, and 1-2000, respectively.
The results suggest that, for both epsilon (ε) and sigma (σ), the accuracy of the ANN model increases dramatically as the number of data points increases from 50 to 400. The final sets of FF parameters for both CG D2O models obtained from all these optimization runs are shown in Tables S3 and S4 of the Supporting Information. The ranges of the input parameters, together with the optimized values, are given in Table S5.
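
The filtering of training data can be written as a small helper. This is a sketch under two assumptions: that "top" means lowest error, and that each record is a dictionary with an `error` key (a hypothetical layout). With `minimum=100` it follows the rule stated earlier (top 20 % or 100 datasets, whichever is greater); with `minimum=0` it reproduces the subset sizes of the pool-size study above.

```python
def select_training_set(records, fraction=0.2, minimum=100):
    """Keep the lowest-error records: max(fraction * N, minimum), capped at N."""
    n_keep = min(len(records), max(minimum, round(fraction * len(records))))
    return sorted(records, key=lambda rec: rec["error"])[:n_keep]


# Synthetic records standing in for (FF parameters, properties, error) triples.
records = [{"id": i, "error": float((i * 37) % 101)} for i in range(2000)]

# Pool-size study: 20 % of the first 250, 500, 750, 1000 and 2000 records.
study_sizes = [len(select_training_set(records[:pool], minimum=0))
               for pool in (250, 500, 750, 1000, 2000)]

# Production rule: the floor of 100 applies while the pool is small.
small_pool = select_training_set(records[:300])
```

Here `study_sizes` evaluates to `[50, 100, 150, 200, 400]`, and `small_pool` holds 100 records because 20 % of 300 is below the floor.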

Figure 3: Comparison of the performance of the PSO (black lines) and ANN-assisted-PSO (red lines) methods. FF parameters were optimized for the D2O non-polarizable model (top panel) and the D2O polarizable model (bottom panel). The number of particles used for the optimization of the parameters is 40 (a, d), 8 (b, e), and 4 (c, f). The error reported in all the graphs is the average value from the three independent optimization runs.

Further, to validate the robustness and transferability of the new FF parameters, MD simulations of 10000 CG D2O molecules were performed at 300 K. Specifically, the surface tension and isothermal compressibility, which were not included as target properties during parameterization, were calculated with these FF sets at 300 K and were in excellent agreement with the reported experimental data (see Table S6 of the Supporting Information). All the properties were determined from the simulation trajectories by using the methods utilized in our previous work.7 Furthermore, to test the ability of the FF parameters of the new CG models to predict properties at various temperatures, we performed simulations at 290 K, 300 K, 310 K, and 330 K for 60 ns. Table S6 lists the density, surface tension, and dielectric constant of the D2O non-polarizable and polarizable models calculated by analyzing these CG MD simulation trajectories. We find that, as expected, for both the non-polarizable and polarizable models of D2O, the density and surface tension decrease with an increase in the temperature, while the self-diffusion coefficient increases. In the temperature range of 290 K to 330 K, the density calculated from the simulation trajectories for the D2O non-polarizable and polarizable models is within 1 % of the experimentally reported values. The surface tension for both D2O models shows good qualitative and quantitative agreement with the experimental data. Similarly, the values of
the self-diffusion coefficients predicted by the new D2O non-polarizable and polarizable models in the temperature range of 290 K to 330 K were within ~25 % and ~15 % of the experimental values, respectively. The dielectric constant of the polarizable model decreases with an increase in the temperature, which is consistent with the experimental observations.37 The radial distribution functions (RDFs) between the beads of the non-polarizable model and between the core beads of the polarizable model showed good agreement with the RDFs obtained from the mapped all-atom trajectory (see Section 3.3 and Figure S7). Thus, the CG D2O models were able to capture both the structural features and the experimental properties.

Next, to compare the efficiency of the two optimization approaches, CG models of DMF were developed. The mapping schemes of the non-polarizable and polarizable DMF molecules, which were based on the grouping of polar and non-polar functional moieties, are shown in Figure 2 d-f. The non-polarizable model contains two chargeless CG beads, AM and CGD2, that interact through van der Waals interactions. As discussed earlier, in the case of the polarizable model, the polar bead is represented by core and virtual sites bearing -q and +q charges, respectively. Based on the mapping schemes employed for the DMF molecule, we needed to determine five and seven variables for the non-polarizable and polarizable CG models, respectively. As for the D2O models, the performance of the ANN-assisted-PSO method was evaluated by comparing the results from optimizations performed with different numbers of particles (N = 40, 8, and 4). Each optimization was repeated in 3 independent runs. Figures S8 and S9 of the Supporting Information show the error vs the number of iterations for all 3 independent optimization runs.
A detailed analysis of the performance of the ANN-assisted-PSO approach in these optimizations, and of the position of the ANN particle in the field when it is not the fittest, is given in Section 4.1 of the Supporting Information. We find that the ANN particle drives the best particle as many as 2-3 times and that it remains the best particle for a maximum of 196 of a total of 200 iterations. On 9 occasions, the ANN particle is the best within the first 5 iterations, irrespective of the total number of particles and the total number of iterations. Moreover, to identify the position of the ANN particle in the field when it is not the fittest, we tracked the error of the following particles during the optimization of the DMF non-polarizable model with 40 and 4 particles as representative examples (Figure S10): (i) the particle with the least error (fittest) in the entire optimization cycle, i.e., the global best (gbest), (ii) the best PSO particle in the present iteration, (iii) the PSO particle with the maximum error in the present iteration, and (iv) the particle generated by the ANN model. We
found that at the beginning of an optimization run the ANN particle is closer to the best PSO particle than to the worst PSO particle (the one with maximum error). With an increase in the number of iterations, the difference between the error of the ANN particle and the errors of the PSO particles with minimum and maximum error decreases, suggesting that the swarm is converging to the optimal solution, i.e., gbest. Tables S9 and S10 of the Supporting Information list the optimized values obtained from the individual optimization runs. The ranges of the input parameters and the final optimized values for both DMF models are given in Table S11. The evolution of the error during one of the optimization runs for the non-polarizable DMF model with 40 particles is shown in Movie S1 of the Supporting Information. Specifically, the advancement of two (εAM and εCGD2) out of a total of seven variables is depicted for both optimization approaches. As can be seen from Figure 4 a-c, the optimization performed for the non-polarizable DMF model with the ANN-assisted-PSO method results in a faster decrease in the error for 40, 8, and 4 particles, respectively, as compared to the PSO-only method. Similar behavior can be seen in Movie S1. In the case of the DMF polarizable CG model, however, we do not observe a dramatic difference for the ANN-assisted-PSO optimization because of the narrow range used for the parameter optimization. This suggests that, given prior knowledge of the parameter search space, one can accelerate the optimization process by tuning the PSO range.

Figure 4: Comparison of the performance of the PSO (black lines) and ANN-assisted-PSO (red lines) methods. FF parameters were optimized for the DMF non-polarizable model (top panel) and the DMF polarizable model (bottom panel). The number of particles used for the optimization of the parameters is 40 (a, d), 8 (b, e), and 4 (c, f). The error reported in all the graphs is the average value from the three independent optimization runs. The decrease in error indicates that the FF parameters are being optimized toward the desired target properties.
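
For orientation, the properties used for targeting and validation (self-diffusion coefficient, isothermal compressibility, surface tension, dielectric constant) are commonly estimated from MD trajectories with the textbook relations below. This is an illustrative summary, not a statement of the authors' procedure, which follows their previous work (ref 7). All symbols are assumptions introduced here: $\mathbf{r}_i(t)$ is a bead position, $V$ the box volume in a constant-pressure run, $P_{\alpha\alpha}$ the diagonal pressure-tensor components of a slab with two liquid-vapor interfaces normal to $z$, $L_z$ the box length along $z$, and $\mathbf{M}$ the total dipole moment of the box (conducting boundary conditions, SI units).

```latex
\begin{align}
D &= \lim_{t\to\infty} \frac{1}{6t}\,
     \big\langle \lvert \mathbf{r}_i(t)-\mathbf{r}_i(0)\rvert^{2} \big\rangle \\
\kappa_T &= \frac{\langle V^{2}\rangle-\langle V\rangle^{2}}{k_B T\,\langle V\rangle} \\
\gamma &= \frac{L_z}{2}\left[\langle P_{zz}\rangle
          -\tfrac{1}{2}\big(\langle P_{xx}\rangle+\langle P_{yy}\rangle\big)\right] \\
\varepsilon_r &= 1+\frac{\langle \mathbf{M}^{2}\rangle-\langle \mathbf{M}\rangle^{2}}
                        {3\,\varepsilon_0 V k_B T}
\end{align}
```

The symbol $\varepsilon_r$ is used for the dielectric constant to keep it distinct from the Lennard-Jones well depth ε that appears among the FF parameters.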

Furthermore, employing the optimized FF sets, MD simulations of 10000 CG DMF molecules were carried out at four different temperatures to explore their transferability (see Table S12 of the Supporting Information). Various properties of DMF, including the surface tension and isothermal compressibility, which were not used as target properties, were determined by analyzing the simulation trajectories. We found that for both the non-polarizable and polarizable models of DMF, the values of density, self-diffusion coefficient, surface tension, isothermal compressibility, and dielectric constant (only for the polarizable model) are in excellent qualitative and quantitative agreement with the experimentally reported values (Table S12). As for the D2O models, the RDFs of the different pairs of beads of both DMF models showed good agreement with the RDFs obtained from the mapped all-atom trajectory (see Section S4.3 and Figure S11). Thus, the CG DMF models were able to capture both the structural features and the experimental properties.

Based on the development of the aforementioned four CG models for two different solvents, we can summarize the following factors that affect the performance of the newly developed ANN-assisted-PSO framework:

1. If the range of parameters used for the PSO method is narrow, then the ANN may not be effective in driving the PSO particles. However, on many occasions a priori knowledge of the parameter space is not available. In such cases, use of the ANN-assisted-PSO method is recommended, as it can dramatically improve the performance of the model development process.
2. Use of an ANN model during the optimization process does not have any negative impact on the performance of the PSO. In other words, even if the parameters predicted by the ANN do not result in better properties than those of the PSO particles, the optimization process will still be driven by the global best particle obtained from the PSO algorithm.
3. The number of particles used in the PSO run can play a significant role, as with an increase in the number of particles the ANN model can be trained on more data.

4. Computational performance of ANN-assisted-PSO framework: Generating the input parameters by using PSO method and training of the ANN model can take several seconds to 1 or 2 minutes. The two main computationally expensive tasks during these optimization runs are the MD simulations and analysis of the simulation trajectories. One can further improve the computational efficiency of this new framework by utilizing parallel computer codes to perform MD simulations and analysis of these trajectories. More details on the computational efficiency can be found in Section 5 of the Supporting Information.
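
Since the particles of one iteration are independent of each other, their simulations can be dispatched concurrently. The sketch below shows one possible arrangement with Python's standard `concurrent.futures` module. It assumes each evaluation mostly waits on an external MD process, which is why threads are used. `evaluate_swarm` and `fake_md_error` are hypothetical names, and the latter merely stands in for "launch MD, analyze the trajectory, return the error".

```python
from concurrent.futures import ThreadPoolExecutor


def evaluate_swarm(particles, run_and_score, max_workers=4):
    """Score every particle concurrently; results keep the order of `particles`."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_and_score, particles))


def fake_md_error(params):
    # Hypothetical stand-in: a real version would start the MD engine as an
    # external process, wait for it, analyze the trajectory, and return the error.
    eps, sigma = params
    return (eps - 0.3) ** 2 + (sigma - 0.7) ** 2


swarm = [(0.1, 0.2), (0.3, 0.7), (0.9, 0.5), (0.4, 0.6)]
errors = evaluate_swarm(swarm, fake_md_error)
best_index = min(range(len(swarm)), key=errors.__getitem__)
```

`ThreadPoolExecutor.map` returns results in input order, so `errors[i]` always belongs to `swarm[i]` regardless of which simulation finishes first.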

In summary, we propose a new framework that integrates molecular dynamics (MD) simulations with particle swarm optimization (PSO) and an artificial neural network (ANN) based machine-learning (ML) model to accelerate the development of coarse-grained (CG) models. The ANN model was trained on-the-fly in a reverse mode for the first time. Specifically, the inputs for training the ANN model were the output properties calculated from the CG model, and the outputs of the ANN model were the input FF parameters used for the CG MD simulations. Overall, the ANN-assisted-PSO method is faster than FF optimizations performed with the PSO-only method. The optimized FF parameters were able to predict several properties of both D2O and DMF with excellent accuracy at different temperatures. This new ANN-assisted-PSO optimization framework can be utilized for the development of classical as well as reactive all-atom (AA) and CG models of different materials including metals, polymers, and their composites. Moreover, the foundation of this new framework is very general and can be extended to integrate any optimization algorithm (e.g., genetic algorithms, gradient descent, etc.) with an ANN-type ML model and MD simulations to accelerate FF development.

ASSOCIATED CONTENT

AUTHOR INFORMATION
Notes
The authors declare no competing financial interests.
# K.K.B. and S.S. contributed equally.

ACKNOWLEDGMENT
The authors acknowledge Advanced Research Computing (ARC), Virginia Tech, for providing the computational resources. S. A. D. acknowledges the Start-up Funds from Virginia Tech.

Supporting Information Available: Details on the PSO algorithm and the ANN model, and further details on the CG models, their predicted properties, and their structural features. The following files are available free of charge.
Details on the PSO algorithm, ANN model, and CG models (PDF)
Movie illustrating the evolution of particles with optimization cycles (avi)

REFERENCES (1) McAliley, J. H.; Bruce, D. A. Development of Force Field Parameters for Molecular Simulation of Polylactide. J. Chem. Theory Comput. 2011, 7, 3756-3767. (2) Yesylevskyy, S. O.; Schäfer, L. V.; Sengupta, D.; Marrink, S. J. Polarizable Water Model for the Coarse-Grained MARTINI Force Field. PLoS Comput. Biol. 2010, 6, e1000810. (3) Betz, R. M.; Walker, R. C. Paramfit: Automated Optimization of Force Field Parameters for Molecular Dynamics Simulations. J. Comput. Chem. 2015, 36, 79-87. (4) Yang, S.; Cui, Z.; Qu, J. A Coarse-Grained Model for Epoxy Molding Compound. J. Phys. Chem. B 2014, 118, 1660-1669. (5) Mostaghim, S.; Hoffmann, M.; Konig, P. H.; Frauenheim, T.; Teich, J. Molecular Force Field Parametrization Using Multi-Objective Evolutionary Algorithms. In Proceedings of the 2004 Congress on Evolutionary Computation (IEEE Cat. No.04TH8753); 2004, 1, 212-219. (6) Ivanov, M. V.; Talipov, M. R.; Timerghazin, Q. K. Genetic Algorithm Optimization of Point

ACS Paragon Plus Environment

12

Page 13 of 15 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry Letters

Charges in Force Field Development: Challenges and Insights. J. Phys. Chem. A 2015, 119, 1422-1434.
(7) Bejagam, K. K.; Singh, S.; An, Y.; Berry, C.; Deshmukh, S. A. PSO-Assisted Development of New Transferable Coarse-Grained Water Models. J. Phys. Chem. B 2018, 122, 1958-1971.
(8) Eberhart, R.; Kennedy, J. A New Optimizer Using Particle Swarm Theory. In Micro Machine and Human Science, MHS ’95., Proceedings of the Sixth International Symposium on; 1995, 39-43.
(9) Pilania, G.; Wang, C.; Jiang, X.; Rajasekaran, S.; Ramprasad, R. Accelerating Materials Property Predictions Using Machine Learning. Sci. Rep. 2013, 3, 2810.
(10) Phillips, C. L.; Voth, G. A. Discovering Crystals Using Shape Matching and Machine Learning. Soft Matter 2013, 9, 8552-8568.
(11) Long, A. W.; Ferguson, A. L. Nonlinear Machine Learning of Patchy Colloid Self-Assembly Pathways and Mechanisms. J. Phys. Chem. B 2014, 118, 4228-4244.
(12) Long, A. W.; Zhang, J.; Granick, S.; Ferguson, A. L. Machine Learning Assembly Landscapes from Particle Tracking Data. Soft Matter 2015, 11, 8141-8153.
(13) Mueller, T.; Kusne, A. G.; Ramprasad, R. Machine Learning in Materials Science: Recent Progress and Emerging Applications. Rev. Comput. Chem. 2016, 29, 186-273.
(14) Srinivasan, S.; Rajan, K. “Property Phase Diagrams” for Compound Semiconductors through Data Mining. Materials 2013, 6, 279-290.
(15) Hamdia, K. M.; Silani, M.; Zhuang, X.; He, P.; Rabczuk, T. Stochastic Analysis of the Fracture Toughness of Polymeric Nanoparticle Composites Using Polynomial Chaos Expansions. Int. J. Fract. 2017, 206, 215-227.
(16) Hamdia, K. M.; Ghasemi, H.; Zhuang, X.; Alajlan, N.; Rabczuk, T. Sensitivity and Uncertainty Analysis for Flexoelectric Nanostructures. Comput. Methods Appl. Mech. Eng. 2018, 337, 95-109.
(17) Wang, C. C.; Pilania, G.; Boggs, S. A.; Kumar, S.; Breneman, C.; Ramprasad, R. Computational Strategies for Polymer Dielectrics Design. Polymer 2014, 55, 979-988.
(18) Xu, J.; Wang, L.; Liang, G.; Wang, L.; Shen, X. A General Quantitative Structure-Property Relationship Treatment for Dielectric Constants of Polymers. Polym. Eng. Sci. 2011, 51, 2408-2416.
(19) Hansen, K.; Montavon, G.; Biegler, F.; Fazli, S.; Rupp, M.; Scheffler, M.; von Lilienfeld, O. A.; Tkatchenko, A.; Müller, K.-R. Assessment and Validation of Machine Learning Methods for Predicting Molecular Atomization Energies. J. Chem. Theory Comput. 2013, 9, 3404-3419.
(20) Xu, J.; Chen, B.; Liang, H. Accurate Prediction of θ (lower Critical Solution Temperature) in Polymer Solutions Based on 3D Descriptors and Artificial Neural Networks. Macromol. Theory Simul. 2008, 17, 109-120.
(21) Afantitis, A.; Melagraki, G.; Makridima, K.; Alexandridis, A.; Sarimveis, H.; Iglessi-Markopoulou, O. Prediction of High Weight Polymers Glass Transition Temperature Using RBF Neural Networks. J. Mol. Struct.: THEOCHEM 2005, 716, 193-198.
(22) Joyce, S. J.; Osguthorpe, D. J.; Padgett, J. A.; Price, G. J. Neural Network Prediction of Glass-Transition Temperatures from Monomer Structure. J. Chem. Soc. Faraday Trans. 1995, 91, 2491-2496.
(23) Chen, X.; Sztandera, L.; Cartwright, H. M. A Neural Network Approach to Prediction of Glass Transition Temperature of Polymers. Int. J. Intell. Syst. 2008, 23, 22-32.
(24) Li, Y.; Li, H.; Pickard, F. C., IV; Narayanan, B.; Sen, F. G.; Chan, M. K. Y.; Sankaranarayanan, S. K. R. S.; Brooks, B. R.; Roux, B. Machine Learning Force Field Parameters from Ab Initio Data. J. Chem. Theory Comput. 2017, 13, 4492-4503.
(25) Botu, V.; Batra, R.; Chapman, J.; Ramprasad, R. Machine Learning Force Fields: Construction, Validation, and Outlook. J. Phys. Chem. C 2017, 121, 511-522.
(26) Chmiela, S.; Tkatchenko, A.; Sauceda, H. E.; Poltavsky, I.; Schütt, K. T.; Müller, K.-R. Machine Learning of Accurate Energy-Conserving Molecular Force Fields. Sci. Adv. 2017, 3, e1603015.
(27) Kruglov, I.; Sergeev, O.; Yanilkin, A.; Oganov, A. R. Energy-Free Machine Learning Force Field for Aluminum. Sci. Rep. 2017, 7, 8512.
(28) Blank, T. B.; Brown, S. D.; Calhoun, A. W.; Doren, D. J. Neural Network Models of Potential Energy Surfaces. J. Chem. Phys. 1995, 103, 4129-4137.
(29) Behler, J.; Parrinello, M. Generalized Neural-Network Representation of High-Dimensional Potential-Energy Surfaces. Phys. Rev. Lett. 2007, 98, 146401.
(30) Bartók, A. P.; Payne, M. C.; Kondor, R.; Csányi, G. Gaussian Approximation Potentials: The Accuracy of Quantum Mechanics, without the Electrons. Phys. Rev. Lett. 2010, 104, 136403.
(31) Behler, J. Perspective: Machine Learning Potentials for Atomistic Simulations. J. Chem. Phys. 2016, 145, 170901.
(32) Botu, V.; Ramprasad, R. Adaptive Machine Learning Framework to Accelerate Ab Initio Molecular Dynamics. Int. J. Quantum Chem. 2015, 115, 1074-1083.
(33) Hassan, A. M.; Alrashdan, A.; Hayajneh, M. T.; Mayyas, A. T. Prediction of Density, Porosity and Hardness in Aluminum–copper-Based Composite Materials Using Artificial Neural Network. J. Mater. Process. Technol. 2009, 209, 894-899.
(34) Schneider, E.; Dai, L.; Topper, R. Q.; Drechsel-Grau, C.; Tuckerman, M. E. Stochastic Neural Network Approach for Learning High-Dimensional Free Energy Surfaces. Phys. Rev. Lett. 2017, 119, 150601.
(35) Friederich, P.; Konrad, M.; Strunk, T.; Wenzel, W. Machine Learning of Correlated Dihedral Potentials for Atomistic Molecular Force Fields. Sci. Rep. 2018, 8, 2559.
(36) Agatonovic-Kustrin, S.; Beresford, R. Basic Concepts of Artificial Neural Network (ANN) Modeling and Its Application in Pharmaceutical Research. J. Pharm. Biomed. Anal. 2000, 22, 717-727.
(37) Malmberg, C. G. Dielectric Constant of Deuterium Oxide. J. Res. Natl. Bur. Stand. 1958, 60, 609-612.
