Published July 28, 2022 | Submitted + Published + Supplemental Material
Journal Article | Open Access

Informing geometric deep learning with electronic interactions to accelerate quantum chemistry

Abstract

Predicting electronic energies, densities, and related chemical properties can facilitate the discovery of novel catalysts, medicines, and battery materials. However, existing machine learning techniques are challenged by the scarcity of training data when exploring unknown chemical spaces. We overcome this barrier by systematically incorporating knowledge of molecular electronic structure into deep learning. By developing a physics-inspired equivariant neural network, we introduce a method to learn molecular representations based on the electronic interactions among atomic orbitals. Our method, OrbNet-Equi, leverages efficient tight-binding simulations and learned mappings to recover high-fidelity physical quantities. OrbNet-Equi accurately models a wide spectrum of target properties while being several orders of magnitude faster than density functional theory. Despite only using training samples collected from readily available small-molecule libraries, OrbNet-Equi outperforms traditional semiempirical and machine learning–based methods on comprehensive downstream benchmarks that encompass diverse main-group chemical processes. Our method also describes interactions in challenging charge-transfer complexes and open-shell systems. We anticipate that the strategy presented here will help to expand opportunities for studies in chemistry and materials science, where the acquisition of experimental or reference training data is costly.
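The sketch below is a conceptual illustration (not the authors' code or the OrbNet-Equi API) of the delta-learning idea summarized in the abstract: a cheap tight-binding calculation supplies atomic-orbital matrices whose atom-pair blocks serve as input features to a learned mapping that corrects the tight-binding result toward a high-fidelity target. All function and variable names (run_tight_binding, pair_block_features, the random stand-in weights) are hypothetical placeholders.

```python
# Minimal sketch, assuming a tight-binding backend that returns symmetric
# atomic-orbital (AO) matrices such as a Fock matrix F and overlap matrix S.
# Everything here is a stand-in for illustration only.

import numpy as np

def run_tight_binding(n_orbitals: int, rng: np.random.Generator):
    """Stand-in for a GFN-xTB-style calculation: returns symmetric AO matrices."""
    F = rng.standard_normal((n_orbitals, n_orbitals))
    S = rng.standard_normal((n_orbitals, n_orbitals))
    return 0.5 * (F + F.T), 0.5 * (S + S.T)

def pair_block_features(M: np.ndarray, ao_slices: list) -> np.ndarray:
    """Summarize each atom-pair block of an AO matrix by its Frobenius norm,
    a rotation-invariant stand-in for the equivariant features used in the paper."""
    feats = []
    for sa in ao_slices:
        for sb in ao_slices:
            feats.append(np.linalg.norm(M[sa, sb]))
    return np.array(feats)

rng = np.random.default_rng(0)
ao_slices = [slice(0, 4), slice(4, 8), slice(8, 9)]   # e.g. O (s+p), O (s+p), H (s)
F, S = run_tight_binding(n_orbitals=9, rng=rng)
x = np.concatenate([pair_block_features(F, ao_slices),
                    pair_block_features(S, ao_slices)])

# A trained readout network would map these features to a correction energy;
# here an untrained random linear layer stands in for that learned mapping.
w = rng.standard_normal(x.shape[0]) * 1e-3
delta_E = float(w @ x)        # learned correction (placeholder value)
E_tb = -5.0                   # placeholder tight-binding energy
print("predicted energy ~", E_tb + delta_E)
```

The design point this sketch tries to convey is that the input features are derived from electronic-structure quantities (AO matrix blocks) rather than from atomic coordinates alone, which is what allows the learned mapping to generalize from small-molecule training data.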

Additional Information

© 2022 The Author(s). Published by PNAS. This article is distributed under the Creative Commons Attribution-NonCommercial-NoDerivatives License 4.0 (CC BY-NC-ND).

Edited by Klavs Jensen, Massachusetts Institute of Technology, Cambridge, MA; received April 1, 2022; accepted June 6, 2022; published July 28, 2022. This article is a PNAS Direct Submission.

Z.Q. acknowledges graduate research funding from Caltech and partial support from the Amazon–Caltech AI4Science fellowship. A.A. and T.F.M. acknowledge partial support from the Caltech DeLogi fund, and A.A. acknowledges support from a Caltech Bren Professorship. Z.Q. thanks Bo Li, Vignesh Bhethanabotla, Dani Kiyasseh, Hongkai Zheng, Sahin Lale, and Rafal Kocielnik for proofreading and helpful comments on the manuscript.

Author contributions: Z.Q., F.R.M., A.A., and T.F.M. designed research; Z.Q. performed research; A.S.C. and M.W. contributed new reagents/analytic tools; Z.Q. and A.S.C. analyzed data; F.R.M. and A.A. contributed to the theoretical results; and Z.Q., A.A., and T.F.M. wrote the paper.

Competing interest statement: A patent application related to this work has been filed. A.S.C., M.W., F.R.M., and T.F.M. are employees of Entos, Inc. or its affiliates. The software used for computing input features and gradients is proprietary to Entos, Inc.

Data availability: Source data for the results described in the text and SI Appendix, the training dataset, code, and evaluation examples have been deposited in Zenodo (https://zenodo.org/record/6568518#.YrtTKHbMK38) (99).

Attached Files

Published - pnas.2205221119.pdf

Submitted - 2105.14655.pdf

Supplemental Material - pnas.2205221119.sapp.pdf

Files (7.2 MB)

2.2 MB (md5:3b047854d53940d3fe491d69a26ae817)
2.1 MB (md5:bfb30a819413511ded30d546ded3f382)
2.9 MB (md5:c72eff7030d1d83873ed94f2ae2a587e)

Additional details

Created: August 22, 2023
Modified: October 23, 2023