Pipelining Saturated Accumulation
Abstract
Aggressive pipelining and spatial parallelism allow integrated circuits (e.g., custom VLSI, ASICs, and FPGAs) to achieve high throughput on many digital signal processing applications. However, cyclic data dependencies in the computation can limit parallelism and reduce the efficiency and speed of an implementation. Saturated accumulation is an important example where such a cycle limits the throughput of signal processing applications. We show how to reformulate saturated addition as an associative operation so that we can use a parallel-prefix calculation to perform saturated accumulation at any data rate supported by the device. This allows us, for example, to design a 16-bit saturated accumulator that can operate at 280 MHz on a Xilinx Spartan-3 (XC3S-5000-4) FPGA, the maximum frequency supported by the component's DCM.
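The reformulation described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: it assumes that each saturated addition is modelled as a function y -> clamp(y + d, lo, hi), stored as a triple `(d, lo, hi)`, and that composing two such functions yields another triple of the same form. Function composition is associative, so the running results can be produced by a prefix scan rather than a serial loop. The names `lift`, `compose`, `apply`, and `prefix_scan`, and the 16-bit signed bounds, are illustrative assumptions.

```python
LO, HI = -32768, 32767  # assumed 16-bit signed saturation bounds


def clamp(v, lo, hi):
    return min(max(v, lo), hi)


def lift(x):
    """Represent 'saturated-add x' as the function y -> clamp(y + x, LO, HI)."""
    return (x, LO, HI)


def compose(f, g):
    """Triple for 'apply f first, then g'; the result is again a clamp-after-add."""
    d1, l1, h1 = f
    d2, l2, h2 = g
    return (d1 + d2, clamp(l1 + d2, l2, h2), clamp(h1 + d2, l2, h2))


def apply(f, y):
    d, lo, hi = f
    return clamp(y + d, lo, hi)


def prefix_scan(items, op):
    """Inclusive scan by recursive doubling; every round is data-parallel."""
    out = list(items)
    step = 1
    while step < len(out):
        # Combine the earlier segment (left) with the later one (right).
        out = [out[i] if i < step else op(out[i - step], out[i])
               for i in range(len(out))]
        step *= 2
    return out


def sat_accumulate_serial(xs, y0=0):
    """Reference version with the cyclic dependency on the running sum."""
    ys, y = [], y0
    for x in xs:
        y = clamp(y + x, LO, HI)
        ys.append(y)
    return ys


def sat_accumulate_prefix(xs, y0=0):
    """Same running sums, computed through the associative reformulation."""
    return [apply(f, y0) for f in prefix_scan([lift(x) for x in xs], compose)]
```

In this sketch the serial loop needs one dependent step per input, while the scan needs a number of rounds logarithmic in the input length, which is the property a pipelined hardware implementation would exploit.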
Additional Information
© Copyright 2009 IEEE. Reprinted with permission. Manuscript received 22 July 2007; revised 31 Dec. 2007; accepted 25 June 2008; published online 16 July 2008. Recommended for acceptance by P. Kornerup, P. Montuschi, J.-M. Muller, and E. Schwarz. This research was funded in part by the US National Science Foundation under Grant CCR-0205471. Stephanie Chan was supported by the Marcella Bonsall SURF Fellowship. Karl Papadantonakis was supported by a Moore Fellowship. Scott Weber and Eylon Caspi developed early FPGA implementations of ADPCM which helped identify this challenge. Michael Wrighton provided VHDL coding and CAD tool usage tips.
Attached Files
Published - PAPieeetc09.pdf
Files
Name | Size |
---|---|
md5:40fd068497f091ae6122d500836989af | 2.5 MB |
Additional details
- Eprint ID: 13060
- Resolver ID: CaltechAUTHORS:PAPieeetc09
- National Science Foundation: CCR-0205471
- Marcella Bonsall SURF Fellowship, Caltech
- Gordon and Betty Moore Foundation
- Created: 2009-01-16 (created from EPrint's datestamp field)
- Updated: 2021-11-08 (created from EPrint's last_modified field)