On the resilience of INT8 Quantized Neural Networks on Low-Power RISC-V Devices / Porsia, Antonio; Perlo, Giacomo; Ruospo, Annachiara; Sanchez, Ernesto. - (2025), pp. 119-122. (Paper presented at AxC'25, the 10th Workshop on Approximate Computing, held in Naples, Italy, June 23-26, 2025) [10.1109/DSN-W65791.2025.00049].
On the resilience of INT8 Quantized Neural Networks on Low-Power RISC-V Devices
Antonio Porsia; Giacomo Perlo; Annachiara Ruospo; Ernesto Sanchez
2025
Abstract
Quantized Neural Networks (QNNs) are increasingly employed to bring Machine Learning (ML) capabilities to edge devices by reducing memory and computational requirements through low-precision arithmetic. QNNs are generally considered more robust to hardware faults, but prior reliability studies, with rare exceptions, have largely relied on high-level fault injection (FI) methodologies that abstract away hardware details. This paper presents a hardware-level fault injection methodology that integrates Verilator-generated gate-level simulations into QNN inference pipelines, enabling the injection of permanent faults directly into gate-level descriptions of functional units during inference. Optimizations for small quantization bit widths are also presented to address performance limitations, for instance leveraging lookup tables and GPU parallelization to speed up simulations. FI campaigns were conducted on multiple networks (MobileNet, MobileNetV2, ResNet18, ResNet34) trained on CIFAR-10 and GTSRB and quantized to INT8 (CIFAR-10 and GTSRB) and UINT8 (CIFAR-10), revealing severe accuracy degradation (an average 72.7% drop in classification accuracy) under single stuck-at faults in an integer multiplier. The results obtained confirm that QNNs may not be as robust as previously thought.
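The lookup-table optimization mentioned in the abstract can be illustrated with a minimal sketch. For an 8-bit multiplier there are only 256 × 256 operand pairs, so the faulty multiplier's behavior can be precomputed once into a table and then used during inference without re-running a gate-level simulation per multiplication. The sketch below is an assumption-laden toy model, not the paper's implementation: the paper injects faults inside Verilator-generated gate-level netlists, whereas here the hypothetical `stuck_at` helper simply forces one bit of the 16-bit product, which only approximates a fault at the multiplier's output.

```python
import numpy as np

def stuck_at(value, bit, stuck_high):
    """Force one bit of the (unsigned) 16-bit product to a stuck value."""
    mask = 1 << bit
    return (value | mask) if stuck_high else (value & ~mask)

def build_faulty_mul_lut(bit=4, stuck_high=True):
    """Precompute all 256x256 INT8 products under a single stuck-at fault,
    so inference performs a table lookup instead of a gate-level simulation."""
    a = np.arange(-128, 128, dtype=np.int32)
    prod = np.outer(a, a)                      # exact INT8 x INT8 products
    raw = prod & 0xFFFF                        # 16-bit two's-complement view
    faulty = stuck_at(raw, bit, stuck_high)    # apply the fault to one bit
    # Reinterpret the 16-bit pattern as a signed product again
    faulty = np.where(faulty >= 0x8000, faulty - 0x10000, faulty)
    return faulty.astype(np.int16)

LUT = build_faulty_mul_lut(bit=4, stuck_high=True)

def faulty_mul(x, y):
    """INT8 x INT8 -> faulty INT16 product via table lookup."""
    return LUT[int(x) + 128, int(y) + 128]
```

Because the table is a plain array, it also vectorizes naturally (e.g. indexing the LUT with whole activation/weight tensors at once), which is in the spirit of the GPU-parallelization optimization the paper describes.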
File | Type | License | Size | Format
---|---|---|---|---
AxC2025.pdf (open access) | 2. Post-print / Author's Accepted Manuscript | Public - All rights reserved | 840.73 kB | Adobe PDF
On_the_Resilience_of_INT8_Quantized_Neural_Networks_on_Low-Power_RISC-V_Devices.pdf (restricted access) | 2a. Post-print, editorial version / Version of Record | Non-public - Private/restricted access | 387.36 kB | Adobe PDF
https://hdl.handle.net/11583/3000821