Published June 14, 2021 | Submitted + Published
Journal Article Open

Low communication high performance ab initio density matrix renormalization group algorithms

Abstract

There has been recent interest in the deployment of ab initio density matrix renormalization group (DMRG) computations on high performance computing platforms. Here, we introduce a reformulation of the conventional distributed memory ab initio DMRG algorithm that connects it to the conceptually simpler and advantageous sum of sub-Hamiltonians approach. Starting from this framework, we further explore a hierarchy of parallelism strategies that includes (i) parallelism over the sum of sub-Hamiltonians, (ii) parallelism over sites, (iii) parallelism over normal and complementary operators, (iv) parallelism over symmetry sectors, and (v) parallelism within dense matrix multiplications. We describe how to reduce processor load imbalance and the communication cost of the algorithm to achieve higher efficiencies. We illustrate the performance of our new open-source implementation on a recent benchmark ground-state calculation of benzene in an orbital space of 108 orbitals and 30 electrons, with a bond dimension of up to 6000, and a model of the FeMo cofactor with 76 orbitals and 113 electrons. The observed parallel scaling from 448 to 2800 central processing unit cores is nearly ideal.
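
Strategy (i) can be illustrated with a toy sketch, under heavy simplifying assumptions: the Hamiltonian is written as a sum of sub-Hamiltonians, H = Σ_k H_k, each worker applies only its own H_k to the vector, and the partial results are summed, which stands in for a reduction across processes. The small dense matrices, the round-robin distribution of terms, the thread pool, and every function name below are hypothetical choices made for illustration. They are not the Block2 API or the authors' implementation, which operates on matrix product operators and states in distributed memory.

```python
from concurrent.futures import ThreadPoolExecutor


def matvec(matrix, vector):
    """Dense matrix-vector product on plain Python lists."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def split_into_sub_hamiltonians(terms, n_workers):
    """Distribute Hamiltonian terms round-robin and sum each worker's share.

    `terms` is a list of equally sized square matrices whose sum is H.
    Returns one sub-Hamiltonian H_k per worker (a zero matrix if it got no terms).
    """
    dim = len(terms[0])
    subs = [[[0.0] * dim for _ in range(dim)] for _ in range(n_workers)]
    for idx, term in enumerate(terms):
        target = subs[idx % n_workers]
        for i in range(dim):
            for j in range(dim):
                target[i][j] += term[i][j]
    return subs


def distributed_apply(sub_hamiltonians, vector):
    """Apply each H_k independently, then reduce (sum) the partial results."""
    with ThreadPoolExecutor(max_workers=len(sub_hamiltonians)) as pool:
        partials = list(pool.map(lambda h: matvec(h, vector), sub_hamiltonians))
    # Element-wise sum over workers, the analogue of an all-reduce.
    return [sum(column) for column in zip(*partials)]


# Toy example: three 2x2 "terms" whose sum is the full Hamiltonian.
terms = [
    [[1.0, 0.0], [0.0, -1.0]],
    [[0.0, 0.5], [0.5, 0.0]],
    [[0.25, 0.0], [0.0, 0.25]],
]
full_h = [[sum(t[i][j] for t in terms) for j in range(2)] for i in range(2)]
vector = [1.0, 2.0]

subs = split_into_sub_hamiltonians(terms, n_workers=2)
result = distributed_apply(subs, vector)   # sum over workers of H_k @ vector
reference = matvec(full_h, vector)         # H @ vector computed directly
```

In this sketch `result` and `reference` agree because applying H is linear in the terms. The only communication needed per application is the final sum of the partial vectors, which is the property that makes this partitioning conceptually simple.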

Additional Information

© 2021 Published under an exclusive license by AIP Publishing. Submitted: 19 March 2021; Accepted: 21 May 2021; Published Online: 14 June 2021. This work was supported by the U.S. National Science Foundation (NSF) (Grant No. CHE-2102505). H.Z. thanks Seunghoon Lee for providing the integrals and reference DMRG outputs for the benzene system, and Henrik R. Larsson, Zhi-Hao Cui, and Tianyu Zhu for helpful discussions. The computations presented in this work were conducted on the Caltech High Performance Cluster, partially supported by a grant from the Gordon and Betty Moore Foundation. Data Availability: The performance data presented in this work can be reproduced using the Block2 code (Ref. 51) and the integral files (Ref. 53) provided in Refs. 20 and 45.

Attached Files

Published - 224116_1_online.pdf

Submitted - 2103-09976.pdf

Additional details

Created: October 3, 2023
Modified: October 24, 2023