GPGPUs have been increasingly successful in recent years in many application domains, due to their high parallel processing capabilities and energy performance. More recently, they have started to be used in areas (such as automotive) where safety is also an important parameter. However, their architectural complexity and advanced technology level make it challenging to meet the required reliability targets. This requires devising solutions to perform in-field test, thus allowing the systematic detection of possible permanent faults. Such faults, caused by aging or external factors, can affect the application execution and potentially generate critical misbehaviors. Moreover, effective in-field test techniques able to verify the integrity of GPGPU modules during operation are still missing. In this work, we propose a method to generate self-test procedures able to detect all static faults affecting the scheduler memory existing in each streaming multiprocessor (SM) of a GPGPU. NVIDIA CUDA-C is selected as the high-level programming language. The experimental results are obtained employing the NVIDIA Nsight Debugger on an NVIDIA GeForce GTX GPU and a memory fault simulator.
On the in-field test of the GPGPU scheduler memory / Di Carlo, Stefano; Condia, Josie E. Rodriguez; Sonza Reorda, Matteo. - PRINT. - (2019), pp. 1-6. (Paper presented at the IEEE 22nd International Symposium on Design and Diagnostics of Electronic Circuits & Systems (DDECS), held in Cluj-Napoca, Romania, 24-26 April 2019) [10.1109/DDECS.2019.8724672].
On the in-field test of the GPGPU scheduler memory
Di Carlo, Stefano; Condia, Josie E. Rodriguez; Sonza Reorda, Matteo
2019
File | Size | Format
---|---|---
camera ready.pdf (open access; Type: 2. Post-print / Author's Accepted Manuscript; License: Public - All rights reserved) | 862.8 kB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/2736932