The reliability assessment of systems powered by artificial intelligence (AI) is becoming a crucial step prior to their deployment in safety- and mission-critical systems. Recently, many efforts have been made to develop sophisticated techniques to evaluate and improve the resilience of AI models against the occurrence of random hardware faults. However, due to the intrinsic nature of such models, comparing the results obtained in state-of-the-art works remains difficult, as reference models are missing. Moreover, their resilience is strongly influenced by the training process, the adopted framework, the data representation, and other factors. To provide a common ground for future research targeting Convolutional Neural Network (CNN) resilience analysis and hardening, this work proposes a first benchmark suite of Deep Learning (DL) models commonly adopted in this context, providing the models, the training/test data, and the resilience-related information (fault list, coverage, etc.) that can be used as a baseline for fair comparison. To this end, this research identifies a set of axes that impact resilience and classifies several popular CNN models, implemented in both PyTorch and TensorFlow. Final considerations are drawn, showing the relevance of a benchmark suite tailored to the resilience context.
Bolchini, Cristiana; Bosio, Alberto; Cassano, Luca; Miele, Antonio; Pappalardo, Salvatore; Passarello, Dario; Ruospo, Annachiara; Sanchez, Ernesto; Sonza Reorda, Matteo; Turco, Vittorio. "Benchmark Suite for Resilience Assessment of Deep Learning Models." IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, ISSN 0278-0070, electronic, 2025. DOI: 10.1109/TCAD.2025.3578297
Benchmark Suite for Resilience Assessment of Deep Learning Models
Cristiana Bolchini; Alberto Bosio; Luca Cassano; Antonio Miele; Salvatore Pappalardo; Dario Passarello; Annachiara Ruospo; Ernesto Sanchez; Matteo Sonza Reorda; Vittorio Turco
2025
File: Benchmark_Suite_for_Resilience_Assessment_of_Deep_Learning_Models.pdf
Access: open access
Type: 2. Post-print / Author's Accepted Manuscript
License: Creative Commons
Size: 5.21 MB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/3000317