
Evaluation and mitigation of faults affecting Swin Transformers / Gavarini, Gabriele; Ruospo, Annachiara; Sanchez, Ernesto. - ELECTRONIC. - (2023), pp. 1-7. (Paper presented at the 29th IEEE International Symposium on On-Line Testing and Robust System Design (IOLTS 2023), held in Chania, Crete (Greece), July 3rd-5th, 2023) [10.1109/IOLTS59296.2023.10224882].

Evaluation and mitigation of faults affecting Swin Transformers

Gabriele Gavarini;Annachiara Ruospo;Ernesto Sanchez
2023

Abstract

In the last decade, a huge effort has been spent on assessing the reliability of Convolutional Neural Networks (CNNs), probably the most popular architecture for image classification tasks. However, modern Deep Neural Networks (DNNs) are rapidly overtaking CNNs, as state-of-the-art results for many tasks are now achieved by Transformers, an innovative class of DNN models. The Transformer architecture introduces the concept of attention as an alternative to the classical convolution operation. The aim of this work is to propose a reliability analysis of the Swin Transformer, one of the most accurate DNNs used for image classification, which greatly improves on the results obtained by traditional CNNs. In particular, this paper shows that, similarly to CNNs, Transformers are susceptible to single faults affecting weights and neurons. Furthermore, it is shown that output ranging, a well-known technique for reducing the impact of a fault in CNNs, is not as effective for the Transformer. The alternative solution proposed by this work is to apply ranging not only to the output, but also to the input and the weights of the fully connected layers. Results show that, on average, the number of critical faults (i.e., faults that modify the network's output) affecting neurons decreases by a factor of 1.91, while for faults affecting the network's weights this value decreases by a factor of 1 x 10^5.
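The ranging mitigation described in the abstract can be sketched as follows. This is a minimal illustration under assumed semantics (clipping values to ranges profiled on fault-free runs), not the authors' implementation; the function names, range bounds, and layer shapes are hypothetical:

```python
import numpy as np

def clamp(x, lo, hi):
    """Restrict values to a profiled range; out-of-range values
    produced by a fault are saturated instead of propagating."""
    return np.clip(x, lo, hi)

def ranged_linear(x, W, b, in_range, w_range, out_range):
    """Fully connected layer with ranging applied to the input,
    the weights, and the output, as the paper proposes."""
    x = clamp(x, *in_range)
    W = clamp(W, *w_range)
    y = x @ W + b
    return clamp(y, *out_range)

# Example: a single weight corrupted to a huge value is
# saturated by the weight range, limiting its effect.
x = np.array([1.0, 2.0])
W = np.array([[1e9, 0.0],   # faulty weight
              [0.0, 1.0]])
b = np.zeros(2)
y = ranged_linear(x, W, b, in_range=(-10, 10),
                  w_range=(-3, 3), out_range=(-10, 10))
# y == [3.0, 2.0] instead of [1e9, 2.0]
```

In this sketch the range bounds would be obtained by profiling the minimum and maximum values observed at each layer during fault-free inference, which is how output ranging is typically calibrated for CNNs.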
ISBN: 979-8-3503-4135-5
Files in this item:

IOLTS23_TransformerCaseStudy.pdf
  Access: open access
  Type: 2. Post-print / Author's Accepted Manuscript
  License: Public - All rights reserved
  Size: 646.85 kB
  Format: Adobe PDF

Evaluation_and_Mitigation_of_Faults_Affecting_Swin_Transformers.pdf
  Access: not available
  Type: 2a. Post-print, publisher's version / Version of Record
  License: Non-public - Private/restricted access
  Size: 1.35 MB
  Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2980263