ZOR: Zero Overhead Reliability Strategies for AI Accelerators / Vacca, Eleonora; Azimi, Sarah; Sterpone, Luca. - ELECTRONIC. - (2024), pp. 248-252. (Paper presented at the 22nd IEEE International NEWCAS Conference 2024, held in Sherbrooke (CAN), 16-19 June 2024) [10.1109/NewCAS58973.2024.10666350].
ZOR: Zero Overhead Reliability Strategies for AI Accelerators
Vacca, Eleonora; Azimi, Sarah; Sterpone, Luca
2024
Abstract
This research investigates the crucial integration of Neural Network (NN) models with the architecture of the hardware (HW) accelerator. Unlike existing approaches that overlook this interaction, we emphasize understanding the accelerator datapath when designing reliability-focused algorithmic solutions. Focusing on the systolic array datapath, we theoretically evaluate fault propagation from the HW layer to the NN. This analysis identifies variations in fault effects linked to different data mapping strategies. Based on the fault propagation model, we propose a novel reliability-oriented mapping strategy that mitigates fault effects through resource rotation. Validation through HW fault injection demonstrates that an architecture-aware NN implementation reduces the impact of faults by up to 40%. Moreover, experimental results indicate that our proposed solution enhances NN resilience, resulting in up to a 30% reduction in the error rate. Most importantly, these enhancements are attained without introducing performance or hardware overhead.

File | Size | Format | |
---|---|---|---|
newcas_24_pdfXpress.pdf (open access; Type: 2. Post-print / Author's Accepted Manuscript; License: Public - All rights reserved) | 337.9 kB | Adobe PDF | |
ZOR_Zero_Overhead_Reliability_Strategies_for_AI_Accelerators.pdf (not available; Type: 2a Post-print publisher's version / Version of Record; License: Not public - Private/restricted access) | 420.06 kB | Adobe PDF | |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/2990347