Journal Article | Open Access
Published December 2021 | Accepted Version + Published

Fast convolutional neural networks on FPGAs with hls4ml

Abstract

We introduce an automated tool for deploying ultra-low-latency, low-power deep neural networks with convolutional layers on field-programmable gate arrays (FPGAs). By extending the hls4ml library, we demonstrate an inference latency of 5 µs using convolutional architectures, targeting microsecond-latency applications like those at the CERN Large Hadron Collider. Considering benchmark models trained on the Street View House Numbers (SVHN) dataset, we demonstrate various methods for model compression to fit the computational constraints of a typical FPGA device used in the trigger and data acquisition systems of particle detectors. In particular, we discuss pruning and quantization-aware training, and demonstrate how resource utilization can be significantly reduced with little to no loss in model accuracy. We show that the FPGA critical resource consumption can be reduced by 97% with zero loss in model accuracy, and by 99% when tolerating a 6% accuracy degradation.
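
The compression techniques named in the abstract combine quantization-aware training (via QKeras) with magnitude-based pruning (via the TensorFlow Model Optimization toolkit). As a minimal sketch of that workflow, the example below quantizes a small convolutional model and wraps it for pruning; the architecture, 6-bit widths, and 50% sparsity target are illustrative assumptions, not the exact configuration used in the paper.

```python
# Minimal sketch: quantization-aware training with QKeras plus magnitude
# pruning with the TensorFlow Model Optimization toolkit. The architecture,
# bit widths, and sparsity target are illustrative assumptions, not the
# paper's exact configuration.
import tensorflow as tf
import tensorflow_model_optimization as tfmot
from qkeras import QActivation, QConv2D, QDense, quantized_bits, quantized_relu

def build_quantized_cnn(input_shape=(32, 32, 3), n_classes=10):
    # Each layer carries its own fixed-point quantizer, so the weights are
    # learned directly at reduced precision (here 6 bits).
    return tf.keras.Sequential([
        QConv2D(16, (3, 3), input_shape=input_shape,
                kernel_quantizer=quantized_bits(6, 0, alpha=1),
                bias_quantizer=quantized_bits(6, 0, alpha=1)),
        QActivation(quantized_relu(6)),
        tf.keras.layers.MaxPooling2D((2, 2)),
        tf.keras.layers.Flatten(),
        QDense(n_classes,
               kernel_quantizer=quantized_bits(6, 0, alpha=1),
               bias_quantizer=quantized_bits(6, 0, alpha=1)),
        tf.keras.layers.Activation('softmax'),
    ])

# Wrap the model for magnitude pruning: the smallest weights are driven to
# zero during training, here toward an assumed 50% sparsity target.
schedule = tfmot.sparsity.keras.ConstantSparsity(0.5, begin_step=0)
model = tfmot.sparsity.keras.prune_low_magnitude(build_quantized_cnn(),
                                                 pruning_schedule=schedule)
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])
# Training then proceeds as usual, with the pruning callback attached, e.g.:
# model.fit(x_train, y_train, epochs=10,
#           callbacks=[tfmot.sparsity.keras.UpdatePruningStep()])
```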

Additional Information

© 2021 The Author(s). Published by IOP Publishing Ltd. Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 license. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI. Received 15 January 2021; Accepted 25 June 2021; Published 16 July 2021.

Acknowledgments: We acknowledge the Fast Machine Learning collective as an open community of multi-domain experts and collaborators. This community was important for the development of this project. M P, S S and V L are supported by the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation program (Grant Agreement No. 772369). S J, M L, K P, and N T are supported by Fermi Research Alliance, LLC under Contract No. DE-AC02-07CH11359 with the U.S. Department of Energy (DOE), Office of Science, Office of High Energy Physics. P H is supported by a Massachusetts Institute of Technology University grant. Z W is supported by the National Science Foundation under Grant Nos. 1606321 and 115164. J D is supported by the DOE, Office of Science, Office of High Energy Physics Early Career Research program under Award No. DE-SC0021187.

Data availability statement: The data that support the findings of this study are openly available.

Code availability statement: The hls4ml library is available at https://github.com/fastmachinelearning/hls4ml and archived on the Zenodo platform at 10.5281/zenodo.4161550. The work presented here is based on the Bartsia release, version 0.5.0. For examples of how to use hls4ml, the notebooks at https://github.com/fastmachinelearning/hls4ml-tutorial serve as a general introduction. The QKeras library, which also includes AutoQKeras and QTools, is available at https://github.com/google/qkeras. The SVHN dataset [17] can be downloaded at http://ufldl.stanford.edu/housenumbers or through TensorFlow Datasets at www.tensorflow.org/datasets/catalog/svhn_cropped.
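
For orientation, a minimal hls4ml conversion flow in the style of the tutorial notebooks linked above (API as of the Bartsia 0.5.0 release) is sketched below; the stand-in model, output directory, FPGA part number, and reuse factor are placeholder assumptions for illustration, not the settings used in the paper.

```python
# Minimal sketch of the hls4ml conversion flow (API as in the v0.5.0
# tutorial notebooks). The stand-in model, output directory, FPGA part,
# and reuse factor are placeholder assumptions.
import tensorflow as tf
import hls4ml

# Stand-in for the trained, pruned/quantized network (e.g. the QKeras
# sketch above); any Keras model with a defined input shape works here.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, (3, 3), activation='relu',
                           input_shape=(32, 32, 3)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation='softmax'),
])

# Per-layer configuration dictionary derived from the model.
config = hls4ml.utils.config_from_keras_model(model, granularity='name')
config['Model']['ReuseFactor'] = 1  # fully parallel: lowest latency

hls_model = hls4ml.converters.convert_from_keras_model(
    model,
    hls_config=config,
    io_type='io_stream',           # streaming IO, used for convolutional layers
    output_dir='hls4ml_prj',       # assumed output directory
    part='xcu250-figd2104-2L-e',   # part number taken from the tutorials
)
hls_model.compile()    # generates the HLS project and a C-simulation library
# hls_model.build()    # would run the full Vivado HLS synthesis (slow)
```

Here ReuseFactor controls the latency/resource trade-off (a reuse factor of 1 multiplies fully in parallel), and io_type='io_stream' selects the streaming implementation hls4ml uses for convolutional layers.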

Attached Files

Published - Aarrestad_2021_Mach._Learn.__Sci._Technol._2_045015.pdf

Accepted Version - 2101.05108.pdf

Files (7.8 MB)

md5:6e7e198e17d9252375559f5530430ecc (5.5 MB)
md5:6446b5f6b16b0ce3bb98104f00faae83 (2.3 MB)

Additional details

Created: August 20, 2023
Modified: October 23, 2023