Graphics Processing Units (GPUs) are considered a promising solution for high-performance safety-critical applications, such as self-driving cars. In this application domain, the use of fault tolerance techniques is mandatory to detect or correct faults, since they must work properly even in the presence of faults. GPUs are designed with aggressive technology scaling, which makes them susceptible to faults caused by radiation interference, such as the Single Event Upsets (SEUs), which can lead the system to fail, and that is unacceptable in safety-critical applications. In this paper, we evaluate different software-based hardening techniques developed to detect SEUs in GPUs general-purpose registers and propose optimizations to improve performance and memory utilization. The techniques are implemented in three case-study applications and evaluated in a general-purpose soft-core GPU based on the NVIDIA G80 architecture. A fault injection campaign is performed at register transfer level to assess the fault detection potential of the implemented techniques. Results show that the proposed improvements can be tailored for different scenarios, helping engineers in navigating the design space of hardened GPGPU applications.

Evaluating Software-based Hardening Techniques for General-Purpose Registers on a GPGPU / Goncalves, Marcio M.; Azambuja, Jose Rodrigo; Rodriguez Condia, Josie E.; Sonza Reorda, Matteo; Sterpone, Luca. - ELETTRONICO. - (2020), pp. 1-6. (Intervento presentato al convegno 2020 IEEE Latin-American Test Symposium (LATS) tenutosi a Maceio, Brasil nel 30 March-2 April 2020) [10.1109/LATS49555.2020.9093682].

Evaluating Software-based Hardening Techniques for General-Purpose Registers on a GPGPU

Rodriguez Condia , Josie E.;Sonza Reorda, Matteo;Sterpone, Luca
2020

Abstract

Graphics Processing Units (GPUs) are considered a promising solution for high-performance safety-critical applications, such as self-driving cars. In this application domain, the use of fault tolerance techniques is mandatory to detect or correct faults, since they must work properly even in the presence of faults. GPUs are designed with aggressive technology scaling, which makes them susceptible to faults caused by radiation interference, such as the Single Event Upsets (SEUs), which can lead the system to fail, and that is unacceptable in safety-critical applications. In this paper, we evaluate different software-based hardening techniques developed to detect SEUs in GPUs general-purpose registers and propose optimizations to improve performance and memory utilization. The techniques are implemented in three case-study applications and evaluated in a general-purpose soft-core GPU based on the NVIDIA G80 architecture. A fault injection campaign is performed at register transfer level to assess the fault detection potential of the implemented techniques. Results show that the proposed improvements can be tailored for different scenarios, helping engineers in navigating the design space of hardened GPGPU applications.
2020
978-1-7281-8731-0
File in questo prodotto:
File Dimensione Formato  
LATS_2020_paper.pdf

accesso aperto

Descrizione: Post-print version of the manuscript
Tipologia: 2. Post-print / Author's Accepted Manuscript
Licenza: PUBBLICO - Tutti i diritti riservati
Dimensione 1.11 MB
Formato Adobe PDF
1.11 MB Adobe PDF Visualizza/Apri
09093682.pdf

non disponibili

Tipologia: 2a Post-print versione editoriale / Version of Record
Licenza: Non Pubblico - Accesso privato/ristretto
Dimensione 180.67 kB
Formato Adobe PDF
180.67 kB Adobe PDF   Visualizza/Apri   Richiedi una copia
Pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11583/2827035