Published July 2018
Journal Article | Open Access

StrassenNets: Deep Learning with a Multiplication Budget

Abstract

A large fraction of the arithmetic operations required to evaluate deep neural networks (DNNs) consists of matrix multiplications, in both convolutional and fully connected layers. We perform end-to-end learning of low-cost approximations of matrix multiplications in DNN layers by casting matrix multiplications as 2-layer sum-product networks (SPNs) (arithmetic circuits) and learning their (ternary) edge weights from data. The SPNs disentangle multiplication and addition operations and enable us to impose a budget on the number of multiplication operations. Combining our method with knowledge distillation and applying it to image classification DNNs (trained on ImageNet) and language modeling DNNs (using LSTMs), we obtain a first-of-a-kind reduction in the number of multiplications (over 99.5%) while maintaining the predictive performance of the full-precision models. Finally, we demonstrate that the proposed framework can rediscover Strassen's matrix multiplication algorithm, learning to multiply 2×2 matrices using only 7 multiplications instead of 8.
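To make the SPN formulation concrete, below is a minimal NumPy sketch assuming the 2-layer parameterization c = Wc((Wa·vec(A)) ⊙ (Wb·vec(B))) with ternary weight matrices, consistent with the abstract. The particular Wa, Wb, Wc shown hand-encode the classical Strassen solution for 2×2 matrices rather than learned weights; this is the solution the abstract says the framework can rediscover by learning.

import numpy as np

# Ternary SPN weights encoding Strassen's algorithm for 2x2 matrices.
# Wa, Wb in {-1, 0, 1}^{7x4} map vectorized inputs to the 7 products;
# Wc in {-1, 0, 1}^{4x7} recombines the products into the output.
Wa = np.array([
    [ 1, 0, 0,  1],   # M1 uses (a11 + a22)
    [ 0, 0, 1,  1],   # M2 uses (a21 + a22)
    [ 1, 0, 0,  0],   # M3 uses  a11
    [ 0, 0, 0,  1],   # M4 uses  a22
    [ 1, 1, 0,  0],   # M5 uses (a11 + a12)
    [-1, 0, 1,  0],   # M6 uses (a21 - a11)
    [ 0, 1, 0, -1],   # M7 uses (a12 - a22)
])
Wb = np.array([
    [ 1, 0, 0,  1],   # M1 uses (b11 + b22)
    [ 1, 0, 0,  0],   # M2 uses  b11
    [ 0, 1, 0, -1],   # M3 uses (b12 - b22)
    [-1, 0, 1,  0],   # M4 uses (b21 - b11)
    [ 0, 0, 0,  1],   # M5 uses  b22
    [ 1, 1, 0,  0],   # M6 uses (b11 + b12)
    [ 0, 0, 1,  1],   # M7 uses (b21 + b22)
])
Wc = np.array([
    [1,  0, 0, 1, -1, 0, 1],   # c11 = M1 + M4 - M5 + M7
    [0,  0, 1, 0,  1, 0, 0],   # c12 = M3 + M5
    [0,  1, 0, 1,  0, 0, 0],   # c21 = M2 + M4
    [1, -1, 1, 0,  0, 1, 0],   # c22 = M1 - M2 + M3 + M6
])

def spn_matmul(A, B):
    """Multiply 2x2 matrices via the SPN c = Wc((Wa a) * (Wb b))."""
    a, b = A.reshape(4), B.reshape(4)
    m = (Wa @ a) * (Wb @ b)    # the only 7 genuine multiplications
    return (Wc @ m).reshape(2, 2)

A, B = np.random.randn(2, 2), np.random.randn(2, 2)
assert np.allclose(spn_matmul(A, B), A @ B)

The seven entries of the elementwise product are the only genuine multiplications; applying the ternary matrices reduces to additions and subtractions, which is how the SPN disentangles the two operation types and makes the multiplication budget explicit.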

Additional Information

© 2018 by the author(s). The authors would like to thank Eirikur Agustsson, Helmut Bölcskei, Lukas Cavigelli, Asmus Hetzel, Risi Kondor, Andrew Lavin, Michael Lerjen, Zachary Lipton, Weitang Liu, Andrea Olgiati, John Owens, Sheng Zha, and Zhi Zhang for inspiring discussions and comments. This work was supported by the "AWS Cloud Credits for Research" program.

Attached Files

Published - tschannen18a.pdf (363.6 kB; md5:01816f2c84ae69e9016aed8c02ce922e)


Additional details

Created: August 19, 2023
Modified: October 20, 2023