Introducing ToPe-FFT: An OpenCL-based FFT library targeting GPUs / Jan, Bilal; Khan, Fiaz Gul; Montrucchio, Bartolomeo; Chronopoulos, Anthony Theodore; Shamshirband, Shahaboddin; Khan, Abdul Nasir. - In: CONCURRENCY AND COMPUTATION. - ISSN 1532-0626. - ELECTRONIC. - 29:21(2017), pp. e4256-e4269. [10.1002/cpe.4256]
Introducing ToPe-FFT: An OpenCL-based FFT library targeting GPUs
Montrucchio, Bartolomeo
2017
Abstract
In this paper, we present our implementation of fast Fourier transforms (FFTs) on graphics processing units (GPUs) using OpenCL. This implementation, ToPe-FFT, is based on the Cooley-Tukey family of algorithms and supports 1D and higher-dimensional transforms using different radices. Mixed-radix factorization enables our code to target FFTs of nearly arbitrary length. In systems with multiple graphics cards (GPUs), the library automatically balances the FFT computation across them, achieving maximum resource utilization and higher speedup. Profiling and micro-benchmarking of ToPe-FFT show that, averaged over different transform sizes, our library achieves a 48× speedup over single-CPU code using FFTW and a 3× speedup over NVIDIA's GPU-based cuFFT library.
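The mixed-radix idea mentioned in the abstract can be illustrated with a minimal CPU-only sketch in plain Python. It is not the ToPe-FFT code and uses no OpenCL. The radix set (2, 3, 5, 7) and the function names `smallest_factor`, `dft`, and `mixed_radix_fft` are assumptions chosen for illustration. The sketch splits a length-n transform into r sub-transforms of length n/r (decimation in time) and falls back to a direct DFT when no supported radix divides the length.

```python
import cmath

def smallest_factor(n, radices=(2, 3, 5, 7)):
    """Return the first radix in `radices` that divides n, or None.
    The radix set is an illustrative assumption, not ToPe-FFT's."""
    for r in radices:
        if n % r == 0:
            return r
    return None

def dft(x):
    """Direct O(n^2) DFT, used as the base case and as a reference."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

def mixed_radix_fft(x):
    """Recursive mixed-radix Cooley-Tukey FFT (decimation in time)."""
    n = len(x)
    if n <= 1:
        return list(x)
    r = smallest_factor(n)
    if r is None or r == n:
        # No further split is possible with the supported radices.
        return dft(x)
    m = n // r
    # r interleaved sub-sequences, each of length m, transformed recursively.
    subs = [mixed_radix_fft(x[q::r]) for q in range(r)]
    out = [0j] * n
    for k in range(n):
        # Combine sub-transforms with twiddle factors exp(-2*pi*i*q*k/n).
        out[k] = sum(subs[q][k % m] * cmath.exp(-2j * cmath.pi * q * k / n)
                     for q in range(r))
    return out
```

For example, a length-60 input is split by radix 2 twice, then by radix 3, and the remaining length-5 pieces are handled by the direct DFT; a length with a large prime factor outside the radix set would spend most of its time in that fallback.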
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/2702004