Machine Learning Energy Gaps of Porphyrins with Molecular Graph Representations


A: Molecular Structure, Quantum Chemistry, and General Theory

Machine Learning Energy Gaps of Porphyrins with Molecular Graph Representations
Zheng Li, Noushin Omidvar, Wei Shan Chin, Esther Robb, Amanda J Morris, Luke E. K. Achenie, and Hongliang Xin
J. Phys. Chem. A, Just Accepted Manuscript • DOI: 10.1021/acs.jpca.8b02842 • Publication Date (Web): 24 Apr 2018


The Journal of Physical Chemistry

Zheng Li,† Noushin Omidvar,† Wei Shan Chin,† Esther Robb,† Amanda Morris,‡ Luke Achenie,† and Hongliang Xin∗,†
†Department of Chemical Engineering, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061
‡Department of Chemistry, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061
E-mail: [email protected]


Abstract

Molecular functionalization of porphyrins opens countless new opportunities in tailoring their physicochemical properties for light-harvesting applications. However, the immense materials space spanned by a vast number of substituent ligands and chelating metal ions prohibits high-throughput screening of combinatorial libraries. In this work, machine learning algorithms equipped with the domain knowledge of chemical graph theory were employed to predict the energy gaps of >12,000 porphyrins from the Computational Materials Repository. Among a variety of graph-based molecular descriptors, the electrotopological-state index, which encodes electronic and topological structure information, captures the energy gaps of porphyrins with a prediction RMSE [...]

[...] 12,000 molecular structures of porphyrins and DFT-calculated properties, e.g., frontier orbital energy levels, optical gaps, and energy gaps. In this study, we focus on the energy gap calculated as the difference between the electron affinity and the ionization potential, i.e., E_ea − E_ip, because of its importance in determining the efficiencies of solar light absorption and energy transfer. 11–19 By varying the side groups R1, R2, R3, and the anchor group R4 at meso-positions, the peripheral substituents L at β-positions, and the chelating metal ion M, as denoted in Fig. 1, a theoretically unlimited number of porphyrins can be conceived and synthesized. In the dataset, the side groups are aromatic ligands, and the anchor group is connected to the methine bridge via a carbon-carbon double or triple bond and carries a carboxylate group as the anchor point to semiconducting supports, e.g., TiO2. 11,12 For the porphyrins in the database, the R1 and R3 ligands are kept the same, while R4 has two rotational configurations with respect to the

Figure 1: Structural labeling scheme of porphyrins with varying ligand substitution and metal chelation. (Labels in the figure: side groups R1, R2, and R3; anchor group R4; peripheral substituents L; metal center M.)
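To illustrate how the labeling scheme of Fig. 1 spans a combinatorial library, the following minimal Python sketch enumerates candidate porphyrins under the stated constraints (R3 is kept equal to R1, and R4 has two rotational configurations). All group and metal labels, the list sizes, and the function name `enumerate_porphyrins` are hypothetical placeholders and are not taken from the dataset. The sketch also assumes a single substituent type L applied to all β-positions, which is a simplification made here for illustration.

```python
from itertools import product

# Hypothetical placeholder labels; these are NOT the functional groups
# or metals actually used in the Computational Materials Repository.
SIDE_GROUPS = ["side_a", "side_b", "side_c"]   # candidates for R1 (= R3) and R2
ANCHOR_GROUPS = ["anchor_x", "anchor_y"]       # candidates for R4
ANCHOR_ROTATIONS = (0, 1)                      # two rotational configurations of R4
PERIPHERAL = ["L_a", "L_b"]                    # assumed: one L type for all beta-positions
METALS = ["M_a", "M_b"]                        # candidates for the metal center M


def enumerate_porphyrins(side_groups, anchor_groups, rotations, peripheral, metals):
    """Yield one dict per candidate porphyrin, enforcing R3 == R1."""
    for r1, r2, r4, rot, lig, metal in product(
        side_groups, side_groups, anchor_groups, rotations, peripheral, metals
    ):
        yield {
            "R1": r1,
            "R2": r2,
            "R3": r1,  # R1 and R3 ligands are kept the same
            "R4": r4,
            "R4_rotation": rot,
            "L": lig,
            "M": metal,
        }


library = list(
    enumerate_porphyrins(SIDE_GROUPS, ANCHOR_GROUPS, ANCHOR_ROTATIONS, PERIPHERAL, METALS)
)
print(len(library))  # 3 * 3 * 2 * 2 * 2 * 2 = 144 candidates
```

Even with these very small placeholder lists the library reaches 144 members, and the count grows multiplicatively with every additional choice of ligand or metal.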

porphyrin plane. A complete list of functional groups used in the dataset can be found in the Computational Materials Repository. 55 To give an overview of how the porphyrins and their properties are distributed across the dataset, the violin plots in Fig. S1 show that metal chelation has little influence on the energy gaps of porphyrins (