Materials Informatics - ACS Publications - American Chemical Society

Editorial Cite This: J. Chem. Inf. Model. 2018, 58, 2377−2379

pubs.acs.org/jcim


Materials Informatics

described an application of Artificial Neural Network (ANN) and Random Forest (RF) models to the cyclic voltammetry behavior of a cobalt-doped ceria/reduced graphene oxide (CoCeO2/rGO) nanocomposite as a supercapacitor. They showed that model predictions agreed with measurements from supercapacitor module power-cycling and suggested that their models can be used for the rational design of novel nanocomposites for supercapacitors.

Sun et al.6 introduce a high-throughput force field simulation (HT-FFS) procedure that combines a general force field with a simulation protocol to calculate thermodynamic properties for a large number of molecules. In a proof-of-concept study, they employ the HT-FFS procedure to calculate thermodynamic properties of molecular liquids from ground level using an all-atom force field. They also show that such calculations can generate large quantities of data that can then be analyzed with machine learning (ML) techniques.

Expanding on the theme of using ML to calculate properties typically accessible only through much more expensive computational approaches, Legrain et al.7 discuss how ML approaches can be used to calculate vibrational properties of inorganic materials that are used to assess transport properties of solids such as finite-temperature stability as well as thermal conductivity. They show that a model trained with a random forest can accurately predict interatomic force constants, which can then be used to estimate phonon spectral features, heat capacities, vibrational entropies, and vibrational free energies in good agreement with those calculated by ab initio quantum mechanical methods.

Large-scale calculations such as those described above can be enabled by providing access to materials repositories that contain information on materials structure and properties. One such repository is AFLOW.org, which has been developed in the Curtarolo group at Duke University.
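As a rough illustration of the vibrational quantities mentioned above in connection with Legrain et al., the following sketch evaluates the textbook harmonic-oscillator expressions for the vibrational free energy, entropy, and heat capacity from a list of mode frequencies. It is not taken from the paper: the function name `harmonic_thermo` and the three frequencies are hypothetical, and a real calculation would sum over a full phonon spectrum rather than a handful of modes.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
K_B = 1.380649e-23   # Boltzmann constant, J/K


def harmonic_thermo(freqs_thz, temperature):
    """Return (F_vib, S_vib, C_v) in J, J/K, J/K for independent harmonic modes.

    freqs_thz: mode frequencies in THz (assumed positive).
    temperature: temperature in K (assumed positive).
    """
    f_vib = s_vib = c_v = 0.0
    for nu in freqs_thz:
        energy = H * nu * 1e12                 # energy quantum h*nu in J
        x = energy / (K_B * temperature)       # reduced energy h*nu / (k_B*T)
        occupation = 1.0 / math.expm1(x)       # Bose-Einstein occupation
        log_term = math.log(-math.expm1(-x))   # ln(1 - exp(-x)), computed stably
        f_vib += 0.5 * energy + K_B * temperature * log_term
        s_vib += K_B * (x * occupation - log_term)
        c_v += K_B * x * x * math.exp(x) * occupation * occupation
    return f_vib, s_vib, c_v


# Hypothetical mode frequencies (THz), chosen only for illustration.
freqs = [2.0, 5.0, 9.0]
f300, s300, cv300 = harmonic_thermo(freqs, 300.0)
print(f"F_vib = {f300:.3e} J, S_vib = {s300:.3e} J/K, C_v = {cv300:.3e} J/K")
```

In a workflow of the kind described, the frequencies would come from force constants (predicted by ML or computed ab initio), and the two sets of thermodynamic quantities could then be compared directly.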
Oses et al.8 report on the development of a new module for AFLOW.org called AFLOW-CHULL. This module provides access to more than 1.8 M structures in AFLOW.org and enables multiple functionalities including the identification of stable phases of materials, which is critical for rational planning of material synthesis.

Several papers report on the use of property prediction models for identifying novel materials with the desired properties using virtual screening approaches. This aspect of materials informatics highlights its methodological similarity to cheminformatics, where virtual screening is traditionally used to discover new organic molecules with the desired biological activity. Wilbraham et al.9 calibrated low-computational-cost density functional tight-binding methods to high-level DFT data computed for a representative, diverse set of conjugated polymers. This calibration enabled them to search rapidly for additional conjugated polymers with desired optical and electronic properties. In a methodologically similar study, Lu et al.10 have employed virtual screening as applied to the

The big data revolution has affected nearly all areas of science, creating both greater-than-ever opportunities and the necessity for gaining new insights via the application of informatics and modeling techniques to analyze data. Such techniques have long been used in chemistry, with this journal providing an important medium for publishing the respective research papers. Many approaches developed in cheminformatics and molecular modeling have found impactful applications in other disciplines that have experienced recent growth, such as materials science.

Materials informatics has emerged as an area of research that is engaged with the application of informatics principles to materials science in order to assist with the discovery, optimization, and development of new materials for multiple purposes. It has been growing rapidly, as evidenced by the constant increase in the number of publications. Indeed, the editors of this journal have noticed a growing number of manuscript submissions in materials informatics in the last couple of years, which led us to organize this special issue. Covering all aspects of materials informatics in a single issue, as diverse as it may be, is clearly not feasible. Thus, we have solicited or selected manuscripts that cover a representative range of subjects, from methodological developments to special topics to application notes introducing novel specialized web-based tools. We briefly comment below on papers that appear in this special issue.

A perspective on workflows to enable materials discovery in surface science by Tran, Palizhati, and Ulissi1 is featured towards the beginning of the issue. They review available databases, models, and workflows found in the literature and then offer their own tools that can be employed to organize calculations leading to the design of novel materials.
In another perspective, Cao, Li, and Mueller2 discuss the history and current state of a theoretical approach to studying materials, especially nanostructures, that is termed cluster expansion. They show how this method can be used to predict the structures and properties of various material systems, exemplified by surfaces and nanostructured materials.

Several papers in this issue discuss methods used to study, design, or discover materials with the desired properties. Blay, Yokoi, and Gonzalez-Diaz3 discuss how perturbation theory and machine learning can be combined in a PTML multioutput model. They demonstrate how this PTML model can be employed to predict up to eight different properties of zeolites, leading to the model-driven design of zeolites as inorganic catalysts. In another contribution, Wagner, Puggioni, and Rondinelli4 analyze local atomic distortions in crystalline materials using statistical methods. They have developed a methodology termed amplitude (a) and normalized-amplitude (n) distortion-mode−property correlation-coefficient-heat maps, aCCHMs and nCCHMs, respectively. They show that aCCHMs are suitable for understanding distortion-mode−property relationships, whereas nCCHMs can be used to understand local mode−mode dependencies within a single composition, NdNiO3, and can guide experiments using nonisotropic perturbations. Khan and colleagues5

© 2018 American Chemical Society

Special Issue: Materials Informatics
Published: December 24, 2018

DOI: 10.1021/acs.jcim.8b00927 J. Chem. Inf. Model. 2018, 58, 2377−2379

Journal of Chemical Information and Modeling

Editorial

be porous and whether the windows are big enough to allow for guest diffusion. Recognizing the importance of incorporating molecular flexibility, pywindow also allows for the analysis of molecular dynamics trajectories performed on such compounds. The method was validated on molecular pores, metal−organic polyhedra, and framework materials either through comparison with other programs or by visual inspection of the results.

In another application note, Chatzigoulas et al.16 describe their development of NanoCrystal as a web-based tool for constructing nanoparticles from any crystal structure based on its crystal habit. Modeling the structures of nanoparticles is an important first step in the study of their properties, dynamics, and potential usage. NanoCrystal builds nanoparticles from a CIF file containing the crystal coordinates, the Miller indices and their corresponding minimum surface energies according to the Wulff construction of a particular crystal, and the particle’s desired size. The output nanoparticle structure is visualized, and its coordinates are provided to the user. The method was validated by constructing nanoparticles of Fe3O4, TiO2, and LiFePO4 based on input obtained from ab initio calculations. The resulting structures compared favorably with structures obtained from the Vesta program and with available experimental data.

As mentioned above, materials informatics is a burgeoning field, as reflected by the growing number of research papers on this subject. It is the hope of the editors that this issue will serve both as an entry point for newcomers and as a source of additional information for scientists already working in materials informatics. We expect that materials informatics will become one of the areas of research covered by this journal on an ongoing basis.

library of over 30 M organic molecules to select candidates for three key layers of an OLED device: the hole- and electron-transport layers (HTL, ETL) and an emissive layer (EML). They demonstrate that their algorithm enables the synergistic selection of viable candidate compounds for all three layers. Shi et al.11 report on the development of a model to predict the specific surface area (SSA) of ABO3-type perovskites. They have established a web server that implements their model to enable users to search for additional perovskite materials with a high SSA.

It is important to assess not only physical and/or electronic properties of materials but also their biological properties such as toxicity. To this end, we have included a paper by Barycki et al.12 that demonstrates how one can estimate the cellular toxicity of ionic liquids using a multiobjective genetic algorithm.

As in other informatics-based disciplines, data visualization via dimensionality reduction is also highly important in materials informatics. Visualizing data can provide an intuitive grasp of the overall data structure, including the formation of clusters and the identification of outliers and activity cliffs. In addition, data visualization could be used to classify new samples into existing groups (e.g., clustering). However, in contrast with cheminformatics, data visualization is much less explored in materials informatics. Two manuscripts in this special issue are engaged with the topic of data visualization. Kaneko13 used the sparse generative topographic mapping (SGTM) method, a variant of the well-known GTM algorithm, for simultaneous data visualization and clustering in two dimensions (2D). The new SGTM algorithm was first validated on synthetic data to check its effectiveness in visualizing and clustering a data set with multimodal data distributions as well as a data set with a continuous, nonlinear distribution.
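To make the idea of visualization by dimensionality reduction concrete, here is a minimal sketch that projects a synthetic, two-cluster data set from five dimensions onto two. It uses ordinary PCA computed by power iteration in pure Python, not the SGTM method discussed above, and the function name `pca_2d`, the data, and all parameters are assumptions chosen for illustration only.

```python
import random


def pca_2d(rows, n_iter=100, seed=0):
    """Project rows (equal-length lists of floats) onto their top two principal components."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    rng = random.Random(seed)
    components = []
    for _ in range(2):
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for _ in range(n_iter):
            # Keep v orthogonal to the components already found.
            for u in components:
                dot = sum(a * b for a, b in zip(v, u))
                v = [a - dot * b for a, b in zip(v, u)]
            # Power-iteration step: multiply by the covariance and renormalize.
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = sum(x * x for x in w) ** 0.5
            v = [x / norm for x in w]
        components.append(v)
    return [[sum(c[j] * u[j] for j in range(d)) for u in components]
            for c in centered]


# Synthetic data: two well-separated Gaussian clusters in five dimensions.
rng = random.Random(42)
cluster_a = [[rng.gauss(0.0, 0.5) for _ in range(5)] for _ in range(20)]
cluster_b = [[rng.gauss(5.0, 0.5) for _ in range(5)] for _ in range(20)]
data = cluster_a + cluster_b

coords = pca_2d(data)
pc1_a = [c[0] for c in coords[:20]]
pc1_b = [c[0] for c in coords[20:]]
print("PC1 range, cluster A:", min(pc1_a), max(pc1_a))
print("PC1 range, cluster B:", min(pc1_b), max(pc1_b))
```

With data this cleanly separated, the two clusters fall on opposite sides of zero along the first component, which is the kind of structure a 2D map is meant to expose. Real materials data sets are rarely this well behaved, which is one reason nonlinear methods such as GTM variants are of interest.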
Next, the algorithm was validated on a set of compounds with experimentally measured solubility data, each characterized by 10 descriptors, and was found to be effective in reducing the 10-dimensional space to 2 dimensions while placing compounds with similar solubility data in the same cluster. Similar results were also obtained for a set of compounds characterized by toxicity data. Along similar lines, Kaspi et al.14 have compared the performance of four dimensionality reduction methods (PCA, Kernel-PCA, Isomap, and Diffusion Map), using several metrics, for treating a seven-dimensional activity space measured for a database of solar cells made entirely of metal oxides. They found that “classical” PCA performed the best in terms of its ability to correctly maintain the original (i.e., in the high-dimensional space) local environment of the samples in the lower dimensional space. In contrast, the nonlinear Isomap method performed the best in assigning class membership based on the identity of nearest neighbors, i.e., it was found to be the best classifier. Moreover, the analysis was able to rationalize many of the outliers identified by all methods.

Dissemination of methods and models in the form of web portals is characteristic of modern science in general, which is why JCIM recently established an application note category of papers. Two such papers are included in this issue. Miklitz and Jelfs15 presented the development of pywindow, a Python-based package for the automated analysis of the structural properties of porous materials. Properties such as cavity diameters, the number of windows and their diameters, and the molecular diameters, which are important for porous compounds, can be studied. In addition, pywindow could be used to determine whether a material has a predisposition to

Hanoch Senderowitz*,†
Alexander Tropsha*,‡

† Department of Chemistry, Bar Ilan University, Ramat-Gan 5290002, Israel
‡ Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, UNC Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, North Carolina 27599, United States

AUTHOR INFORMATION

Corresponding Authors

*E-mail: [email protected].
*E-mail: [email protected].

ORCID

Hanoch Senderowitz: 0000-0003-0076-1355
Alexander Tropsha: 0000-0003-3802-8896

Notes

Views expressed in this editorial are those of the authors and not necessarily the views of the ACS.



REFERENCES

(1) Tran, K.; Palizhati, A.; Back, S.; Ulissi, Z. W. Dynamic Workflows for Routine Materials Discovery in Surface Science. J. Chem. Inf. Model. 2018, DOI: 10.1021/acs.jcim.8b00386.
(2) Cao, L.; Li, C.; Mueller, T. The Use of Cluster Expansions To Predict the Structures and Properties of Surfaces and Nanostructured Materials. J. Chem. Inf. Model. 2018, DOI: 10.1021/acs.jcim.8b00413.
(3) Blay, V.; Yokoi, T.; Gonzalez-Diaz, H. Perturbation Theory-Machine Learning Study of Zeolite Materials Desilication. J. Chem. Inf. Model. 2018, DOI: 10.1021/acs.jcim.8b00383.


(4) Wagner, N.; Puggioni, D.; Rondinelli, J. M. Learning from Correlations Based on Local Structure: Rare-Earth Nickelates Revisited. J. Chem. Inf. Model. 2018, DOI: 10.1021/acs.jcim.8b00411.
(5) Parwaiz, S.; Malik, O. A.; Pradhan, D.; Khan, M. M. Machine-Learning-Based Cyclic Voltammetry Behavior Model for Supercapacitance of Co-Doped Ceria/rGO Nanocomposite. J. Chem. Inf. Model. 2018, DOI: 10.1021/acs.jcim.8b00612.
(6) Gong, Z.; Wu, Y.; Wu, L.; Sun, H. Predicting Thermodynamic Properties of Alkanes by High-Throughput Force Field Simulation and Machine Learning. J. Chem. Inf. Model. 2018, DOI: 10.1021/acs.jcim.8b00407.
(7) Legrain, F.; van Roekeghem, A.; Curtarolo, S.; et al. Vibrational Properties of Metastable Polymorph Structures by Machine Learning. J. Chem. Inf. Model. 2018, DOI: 10.1021/acs.jcim.8b00279.
(8) Oses, C.; Gossett, E.; Curtarolo, S.; et al. AFLOW-CHULL: Cloud-Oriented Platform for Autonomous Phase Stability Analysis. J. Chem. Inf. Model. 2018, DOI: 10.1021/acs.jcim.8b00393.
(9) Wilbraham, L.; Berardo, E.; Turcani, L.; et al. High-Throughput Screening Approach for the Optoelectronic Properties of Conjugated Polymers. J. Chem. Inf. Model. 2018, DOI: 10.1021/acs.jcim.8b00256.
(10) Lu, S.-Y.; Mukhopadhyay, S.; Froese, R.; Zimmerman, P. M. Virtual Screening of Hole Transport, Electron Transport, and Host Layers for Effective OLED Design. J. Chem. Inf. Model. 2018, DOI: 10.1021/acs.jcim.8b00044.
(11) Shi, L.; Chang, D.; Ji, X.; Lu, W. Using Data Mining To Search for Perovskite Materials with Higher Specific Surface Area. J. Chem. Inf. Model. 2018, DOI: 10.1021/acs.jcim.8b00436.
(12) Barycki, M.; Sosnowska, A.; Jagiello, K.; Puzyn, T. Multi-Objective Genetic Algorithm (MOGA) As a Feature Selecting Strategy in the Development of Ionic Liquids Quantitative Toxicity-Toxicity Relationship Models. J. Chem. Inf. Model. 2018, DOI: 10.1021/acs.jcim.8b00378.
(13) Kaneko, H. Sparse Generative Topographic Mapping for Both Data Visualization and Clustering. J. Chem. Inf. Model. 2018, DOI: 10.1021/acs.jcim.8b00528.
(14) Kaspi, O.; Yosipof, A.; Senderowitz, H. Visualization of Solar Cell Library Space by Dimensionality Reduction Methods. J. Chem. Inf. Model. 2018, DOI: 10.1021/acs.jcim.8b00552.
(15) Miklitz, M.; Jelfs, K. E. pywindow: Automated Structural Analysis of Molecular Pores. J. Chem. Inf. Model. 2018, DOI: 10.1021/acs.jcim.8b00490.
(16) Chatzigoulas, A.; Karathanou, K.; Dellis, D.; Cournia, Z. NanoCrystal: A Web-Based Crystallographic Tool for the Construction of Nanoparticles Based on Their Crystal Habit. J. Chem. Inf. Model. 2018, DOI: 10.1021/acs.jcim.8b00269.
