Effective fault simulation of GPU’s permanent faults for reliability estimation of CNNs / Guerrero-Balaguera, Juan-David; Sierra, Robert Limas; Reorda, Matteo Sonza. - (2022), pp. 1-6. (Paper presented at the 2022 IEEE 28th International Symposium on On-Line Testing and Robust System Design (IOLTS), held in Torino) [10.1109/IOLTS56730.2022.9897823].

Effective fault simulation of GPU’s permanent faults for reliability estimation of CNNs

Guerrero-Balaguera, Juan-David; Sierra, Robert Limas; Reorda, Matteo Sonza
2022

Abstract

Convolutional Neural Networks (CNNs) and Graphics Processing Units (GPUs) are increasingly adopted in cutting-edge safety-critical applications. It is therefore crucial to evaluate the reliability of these systems, since the hardware can be affected by several phenomena (e.g., device wear-out) that produce permanent defects in the GPU. These defects may cause the CNN to produce wrong outcomes that endanger the application. Traditionally, the effects of permanent faults on CNNs have been studied through application-level fault injection (e.g., acting on the weights). However, this approach has a restricted scope and may not reveal the actual vulnerabilities of the GPU device. Hence, a more accurate evaluation of the fault effects is required, one that considers the device's hardware in greater detail. This work presents a more elaborate experimental evaluation of the impact of permanent GPU faults on the reliability of a CNN, using a Software-Implemented Fault Injection (SWIFI) strategy that considers faults at the hardware level. The results of the fault simulation campaigns we performed on the GPU data-path cores are compared with those obtained at the application level, showing that the latter are generally optimistic.
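The contrast the abstract draws between the two injection levels can be illustrated with a small sketch. This is not the SWIFI tool used in the paper. It assumes a stuck-at fault model on one bit of a float32 value, and all function names (`stuck_at`, `dot_weight_fault`, `dot_datapath_fault`) are hypothetical. An application-level fault corrupts one stored weight, whereas a permanent fault in a data-path unit (modeled here as a stuck bit on the multiplier output) touches every product computed by that unit.

```python
import struct

def f32_to_bits(x: float) -> int:
    """Return the IEEE 754 single-precision bit pattern of x."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_to_f32(b: int) -> float:
    """Interpret a 32-bit pattern as an IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<I", b & 0xFFFFFFFF))[0]

def stuck_at(x: float, bit: int, value: int) -> float:
    """Force one bit of the float32 encoding of x to 0 or 1 (assumed fault model)."""
    b = f32_to_bits(x)
    mask = 1 << bit
    b = (b | mask) if value else (b & ~mask)
    return bits_to_f32(b)

def dot_golden(weights, inputs):
    """Fault-free dot product, used as the reference result."""
    return sum(w * x for w, x in zip(weights, inputs))

def dot_weight_fault(weights, inputs, index, bit, value):
    """Application-level model: corrupt a single stored weight."""
    faulty = list(weights)
    faulty[index] = stuck_at(faulty[index], bit, value)
    return dot_golden(faulty, inputs)

def dot_datapath_fault(weights, inputs, bit, value):
    """Hardware-inspired model: the multiplier output has a stuck bit,
    so every product that passes through the unit is affected."""
    return sum(stuck_at(w * x, bit, value) for w, x in zip(weights, inputs))

weights = [0.5, 0.25, 1.0, 2.0]
inputs = [1.0, 1.0, 1.0, 1.0]

# Bit 23 is the least significant exponent bit of a float32 value.
golden = dot_golden(weights, inputs)                    # 3.75
app_level = dot_weight_fault(weights, inputs, 0, 23, 1) # 4.25
hw_level = dot_datapath_fault(weights, inputs, 23, 1)   # 6.25
print(golden, app_level, hw_level)
```

In this toy case the same stuck bit changes one term under the weight-level model and two terms under the data-path model, so the weight-level result deviates less from the reference. This only illustrates why an application-level campaign can look optimistic; it is not evidence for the paper's measured results.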
ISBN: 978-1-6654-7355-2
File in this record:
Effective_fault_simulation_of_GPUs_permanent_faults_for_reliability_estimation_of_CNNs.pdf

Not available

Type: 2a Post-print, publisher's version / Version of Record
License: Not public - private/restricted access
Size: 1.43 MB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2972029