GPGPUs have become increasingly successful in recent years in many application domains, owing to their high parallel processing capabilities and energy performance. More recently, they have started to be used in areas (such as automotive) where safety is also an important parameter. However, their architectural complexity and advanced technology level make it challenging to meet the required reliability targets. This calls for solutions that perform in-field test, allowing the systematic detection of possible permanent faults. These faults, caused by aging or external factors, can affect the application execution and potentially generate critical misbehaviors. Moreover, effective test techniques that verify the integrity of GPGPU modules during in-field operation are still lacking. In this work, we propose a method to generate self-test procedures able to detect all static faults affecting the scheduler memory in each streaming multiprocessor (SM) of a GPGPU. NVIDIA CUDA-C is selected as the high-level programming language. The experimental results are obtained using the NVIDIA Nsight Debugger on an NVIDIA GeForce GTX GPU and a memory fault simulator.

On the in-field test of the GPGPU scheduler memory / Di Carlo, Stefano; Condia, Josie E. Rodriguez; Sonza Reorda, Matteo. - PRINT. - (2019), pp. 1-6. (Paper presented at the IEEE 22nd International Symposium on Design and Diagnostics of Electronic Circuits & Systems (DDECS), held in Cluj-Napoca (Romania), 24-26 April 2019) [10.1109/DDECS.2019.8724672].

On the in-field test of the GPGPU scheduler memory

Di Carlo, Stefano; Condia, Josie E. Rodriguez; Sonza Reorda, Matteo
2019

978-1-7281-0073-9
Type: 2. Post-print / Author's Accepted Manuscript
License: PUBLIC - All rights reserved

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2736932