Published November 28, 2021 | Supplemental Material + Submitted + Published
Journal Article Open

OrbNet Denali: A machine learning potential for biological and organic chemistry with semi-empirical cost and DFT accuracy

Abstract

We present OrbNet Denali, a machine learning model for electronic structure that is designed as a drop-in replacement for ground-state density functional theory (DFT) energy calculations. The model is a message-passing graph neural network that uses symmetry-adapted atomic orbital features from a low-cost quantum calculation to predict the energy of a molecule. OrbNet Denali is trained on a vast dataset of 2.3 × 10⁶ DFT calculations on molecules and geometries. This dataset covers the most common elements in biochemistry and organic chemistry (H, Li, B, C, N, O, F, Na, Mg, Si, P, S, Cl, K, Ca, Br, and I) and includes charged molecules. OrbNet Denali is demonstrated on several well-established benchmark datasets, and we find that it provides accuracy that is on par with modern DFT methods while offering a speedup of up to three orders of magnitude. For the GMTKN55 benchmark set, OrbNet Denali achieves WTMAD-1 and WTMAD-2 scores of 7.19 and 9.84, on par with modern DFT functionals. For several GMTKN55 subsets, which contain chemical problems that are not present in the training set, OrbNet Denali produces mean absolute errors comparable to those of DFT methods. For the Hutchison conformer benchmark set, OrbNet Denali has a median correlation coefficient of R² = 0.90 compared to the reference DLPNO-CCSD(T) calculation and R² = 0.97 compared to the method used to generate the training data (ωB97X-D3/def2-TZVP), exceeding the performance of any other method with a similar cost. Similarly, the model reaches chemical accuracy for non-covalent interactions in the S66x10 dataset. For the diverse fragments in the TorsionNet500 dataset, OrbNet Denali reproduces the ωB97X-D3/def2-TZVP torsional potential energy surfaces with an average mean absolute error of 0.12 kcal/mol.
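The abstract reports agreement with reference methods as mean absolute errors (in kcal/mol) and as R² values. As a rough illustration of what these two metrics measure, the sketch below computes both for one set of relative conformer energies. It assumes R² means the squared Pearson correlation coefficient, which the abstract does not state explicitly. The function names and the energy values are hypothetical and are not taken from the paper or its datasets.

```python
def mean_absolute_error(predicted, reference):
    """Average of |predicted - reference| over paired values."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


def pearson_r_squared(predicted, reference):
    """Squared Pearson correlation between paired values (assumed meaning of R²)."""
    if len(predicted) != len(reference) or len(predicted) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_r = sum(reference) / n
    cov = sum((p - mean_p) * (r - mean_r) for p, r in zip(predicted, reference))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    if var_p == 0 or var_r == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov * cov / (var_p * var_r)


# Hypothetical relative conformer energies in kcal/mol (illustrative only).
reference_energies = [0.0, 1.0, 2.0, 3.0]
predicted_energies = [0.1, 0.9, 2.2, 2.8]

mae = mean_absolute_error(predicted_energies, reference_energies)
r2 = pearson_r_squared(predicted_energies, reference_energies)
print(f"MAE = {mae:.3f} kcal/mol, R^2 = {r2:.3f}")
```

A median R² like the one quoted for the Hutchison set would then be the median of such per-molecule values, again under the assumption stated above.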

Additional Information

© 2021 Author(s). Published under an exclusive license by AIP Publishing. Submitted: 1 July 2021; Accepted: 26 October 2021; Published Online: 23 November 2021.

Z.Q. acknowledges graduate research funding from Caltech and partial support from the Amazon–Caltech AI4Science fellowship. T.F.M. and A.A. acknowledge partial support from the Caltech DeLogi fund, and A.A. acknowledges support from a Caltech Bren professorship. The authors acknowledge NVIDIA, including Abe Stern, Thorsten Kurth, Josh Romero, and Tom Gibbs, for helpful discussions regarding GPU implementations of graph neural networks. Computational resources were provided by the National Energy Research Scientific Computing Center (NERSC), a DOE Office of Science User Facility supported by the DOE Office of Science, under Contract No. DE-AC02-05CH11231. This research used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725.

Conflict of Interest: Nine of the authors (A.S.C., S.K.S., M.B.O., D.G.A.S., F.D., P.J.B., M.W., F.R.M., and T.F.M.) are employees of Entos, Inc., or its affiliates.

Author Contributions: A.S.C. and S.K.S. contributed equally to this work.

Data Availability: The 2.3 × 10⁶ geometries and energy labels in the OrbNet Denali training set are openly available in FigShare at https://doi.org/10.6084/m9.figshare.14883867.

Attached Files

Published - 5.0061990.pdf

Submitted - 2107.00299.pdf

Supplemental Material - si.pdf

Additional details

Created: August 20, 2023
Modified: October 23, 2023