An optimized magnetostatic field solver on GPU using open computing language / Khan, Fiaz Gul; Montrucchio, Bartolomeo; Jan, Bilal; Khan, Abdul Nasir; Jadoon, Waqas; Shamshirband, Shahaboddin; Chronopoulos, Anthony Theodore; Khan, Iftikhar Ahmed. - In: CONCURRENCY AND COMPUTATION. - ISSN 1532-0626. - PRINT. - 29:5(2017), p. e3981. [10.1002/cpe.3981]
An optimized magnetostatic field solver on GPU using open computing language
MONTRUCCHIO, BARTOLOMEO;
2017
Abstract
Recent graphics processing units (GPUs) offer remarkable raw computing power, which can be applied to very computationally demanding problems. One such problem arises in micromagnetic simulations, where computing the magnetostatic field to analyze magnetic behavior at very small time and space scales requires a huge amount of computation time. This paper presents a multidimensional FFT-based parallel implementation of magnetostatic field computation on GPUs. We have developed a specialized 3D FFT library for magnetostatic field calculation on GPUs, which made it possible to fully exploit the symmetries inherent in the field calculation and to apply other optimizations specific to the GPU architecture. We have compared our results with the widely used CPU-based parallel OOMMF program and with an equivalent serial implementation on CPU. The results show a speedup of up to 95x over the equivalent serial implementation and 8.7x over OOMMF in single precision, and up to 66x and 4.6x, respectively, in double precision.
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/2674595