On the resilience of INT8 Quantized Neural Networks on Low-Power RISC-V Devices / Porsia, Antonio; Perlo, Giacomo; Ruospo, Annachiara; Sanchez, Ernesto. - (2025), pp. 119-122. (Paper presented at AxC'25, the 10th Workshop on Approximate Computing, held in Naples, Italy, June 23-26, 2025) [10.1109/DSN-W65791.2025.00049].

On the resilience of INT8 Quantized Neural Networks on Low-Power RISC-V Devices

Antonio Porsia; Giacomo Perlo; Annachiara Ruospo; Ernesto Sanchez
2025

Abstract

Quantized Neural Networks (QNNs) are increasingly employed to bring Machine Learning (ML) capabilities to edge devices, as they reduce memory and computational requirements through low-precision arithmetic. QNNs are generally considered more robust to hardware faults, but prior reliability studies, with rare exceptions, have relied on high-level fault injection (FI) methodologies that abstract away hardware details. This paper presents a hardware-level fault injection methodology that integrates Verilator-generated gate-level simulations into QNN inference pipelines, enabling permanent faults to be injected directly into gate-level descriptions of functional units during inference. To address the resulting performance overhead, optimizations for small quantization bit widths are also presented, such as lookup tables and GPU parallelization to speed up simulations. FI campaigns were conducted on multiple networks (MobileNet, MobileNetV2, ResNet18, ResNet34) trained on CIFAR-10 and GTSRB and quantized to INT8 (CIFAR-10 and GTSRB) and UINT8 (CIFAR-10). They reveal severe degradation, an average drop of 72.7% in classification accuracy, under single stuck-at faults in an integer multiplier. These results indicate that QNNs may not be as robust as previously thought.
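The lookup-table optimization mentioned in the abstract can be sketched as follows. For 8-bit operands there are only 256 × 256 input pairs, so the faulty multiplier can be evaluated exhaustively once and its outputs cached, replacing per-operation gate-level simulation with a table lookup during inference. This is a minimal illustrative sketch, not the paper's implementation: the `faulty_mul` fault model (a stuck-at on one product bit) is a hypothetical stand-in for the actual Verilator gate-level model, and all names are assumptions.

```python
import numpy as np

def faulty_mul(a: int, b: int, stuck_bit: int = 5, stuck_val: int = 1) -> int:
    """Hypothetical faulty 8-bit signed multiply. The real methodology
    simulates a gate-level netlist; here the permanent fault is
    approximated as a stuck-at on one bit of the product."""
    p = int(a) * int(b)
    mask = 1 << stuck_bit
    return (p | mask) if stuck_val else (p & ~mask)

# Enumerate all 256 x 256 INT8 operand pairs once and cache the faulty
# products, so inference pays only the cost of an array lookup.
vals = np.arange(-128, 128, dtype=np.int32)
lut = np.empty((256, 256), dtype=np.int32)
for i, a in enumerate(vals):
    for j, b in enumerate(vals):
        lut[i, j] = faulty_mul(a, b)

def mul_int8(a, b):
    """Vectorized faulty multiply: offset INT8 operands into [0, 255]
    and index the precomputed table."""
    return lut[np.asarray(a, dtype=np.int32) + 128,
               np.asarray(b, dtype=np.int32) + 128]
```

The same table can be uploaded to a GPU and gathered in parallel across an entire convolution, which is the spirit of the GPU-parallelization optimization the abstract refers to.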
ISBN: 979-8-3315-1205-7
Files in this item:

AxC2025.pdf
Open access
Type: 2. Post-print / Author's Accepted Manuscript
License: Public - All rights reserved
Size: 840.73 kB
Format: Adobe PDF

On_the_Resilience_of_INT8_Quantized_Neural_Networks_on_Low-Power_RISC-V_Devices.pdf
Restricted access
Type: 2a. Post-print, editorial version / Version of Record
License: Non-public - Private/restricted access
Size: 387.36 kB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this item: https://hdl.handle.net/11583/3000821