Two approaches to the concurrent implementation of the prime factor algorithm on a hypercube
Abstract
On sequential computers, the prime factor algorithm (PFA) allows the Computation of the discrete Fourier transform (DFT) with a higher efficiency than the traditional Cooley‐Tukey FFT algorithm (CTA). However, the PFA requires substantial data movement, which poses a challenging problem for distributed‐memory multi‐processor systems. In this paper, two approaches for a concurrent implementation of the PFA on these structures are presented. In the first approach, the concurrent PFA runs on all nodes of the multi‐processor system, which is inefficient on large configurations due to the large communication overhead. A second approach developed to reduce this bottleneck is also presented. These solutions have been benchmarked on Caltech hypercubes, and the performances achieved are reported. In both approaches, the crystal_router algorithm was exploited as a concurrent technique for communicating data among nodes.
Additional Information
© 1991 John Wiley & Sons, Ltd. Manuscript revised: 22 March 1991; Manuscript received: 10 August 1989. Funding Information: Italian Space Agency. Grant Numbers: ASI‐90‐RS‐69, ASI‐90‐RS‐102.Additional details
- Eprint ID
- 85538
- Resolver ID
- CaltechAUTHORS:20180330-132951062
- Agenzia Spaziale Italiana (ASI)
- ASI-90-RS-69
- Agenzia Spaziale Italiana (ASI)
- ASI-90-RS-102
- Created
-
2018-03-30Created from EPrint's datestamp field
- Updated
-
2021-11-15Created from EPrint's last_modified field