Published March 28, 2021 | Submitted + Supplemental Material + Published
Journal Article | Open

Analytical gradients for molecular-orbital-based machine learning

Abstract

Molecular-orbital-based machine learning (MOB-ML) enables the prediction of accurate correlation energies at the cost of obtaining molecular orbitals. Here, we present the derivation, implementation, and numerical demonstration of MOB-ML analytical nuclear gradients, which are formulated in a general Lagrangian framework to enforce orthogonality, localization, and Brillouin constraints on the molecular orbitals. The MOB-ML gradient framework is general with respect to the regression technique (e.g., Gaussian process regression or neural networks) and the MOB feature design. We show that MOB-ML gradients are highly accurate compared to other ML methods on the ISO17 dataset while only being trained on energies for hundreds of molecules compared to energies and gradients for hundreds of thousands of molecules for the other ML methods. The MOB-ML gradients are also shown to yield accurate optimized structures at a computational cost for the gradient evaluation that is comparable to a density-corrected density functional theory calculation.

Additional Information

© 2021 Published under license by AIP Publishing.

Submitted: 16 December 2020; Accepted: 2 March 2021; Published Online: 25 March 2021.

This work was supported, in part, by the U.S. Army Research Laboratory (Grant No. W911NF-12-2-0023), the U.S. Department of Energy (Grant No. DE-SC0019390), the Caltech DeLogi Fund, and the Camille and Henry Dreyfus Foundation (Award No. ML-20-196). S.J.R.L. thanks the Molecular Software Sciences Institute (MolSSI) for a MolSSI investment fellowship. T.H. acknowledges funding through an Early Post-Doc Mobility Fellowship by the Swiss National Science Foundation (Award No. P2EZP2_184234). Computational resources were provided by the National Energy Research Scientific Computing Center (NERSC), a DOE Office of Science User Facility supported by the DOE Office of Science under Contract No. DE-AC02-05CH11231.

Data Availability: The data that support the findings of this study are available within the article and its supplementary material. The dataset used in Table I and Fig. 1 is available from Ref. 60. The dataset used in Fig. 2 is available from Ref. 60. The dataset used in Table II and Fig. 3 is available from Ref. 14.

Attached Files

Published - 124120_1_online.pdf

Submitted - 2012_08899.pdf

Supplemental Material - si.pdf

Additional details

Created: October 3, 2023
Modified: October 24, 2023