Coelho, Bruno Loureiro; Bodmann, Pablo R.; Cavagnero, Niccolo; Frost, Christopher; Rech, Paolo. "Vision Transformer Reliability Evaluation on the Coral Edge TPU." IEEE Transactions on Nuclear Science, 2024. ISSN 0018-9499. DOI: 10.1109/TNS.2024.3513774

Vision Transformer Reliability Evaluation on the Coral Edge TPU

Coelho, Bruno Loureiro; Bodmann, Pablo R.; Cavagnero, Niccolo; Frost, Christopher; Rech, Paolo
2024

Abstract

Vision transformers (ViTs) outperform convolutional neural networks (CNNs) in tasks such as image classification and, despite their high computational complexity, can still be mapped to low-power EdgeAI accelerators such as the Coral Tensor Processing Unit (TPU). In this paper, through accelerated neutron beam experiments, we study the reliability of six ViTs and four micro-benchmarks executed on the Coral TPU. According to our data, the internal size of the attention heads (the main computational block in ViTs) has a negligible impact on the model's Failures In Time (FIT) rate compared to increasing the number of heads; furthermore, our results show that employing convolutions in the patch embedding reduces the FIT rate of the model. Additionally, we decompose the ViT into four basic computational blocks that represent the main operators of the model, showing that, although the transformer layer (with multi-head self-attention and multi-layer perceptron) presents the highest FIT rate, it is actually the patch embedding that is more likely to cause misclassifications. These results can be leveraged to design hardening techniques that improve the resilience of the critical blocks identified in our evaluation while minimizing the additional overhead.
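
For context, the FIT rate in accelerated neutron beam testing is conventionally obtained by scaling the measured error cross section to a reference atmospheric flux. The derivation below is the standard JEDEC-style formulation, included only for the reader's orientation; the symbols and the reference flux value are generic conventions, not figures taken from this paper.

```latex
% Standard FIT-rate derivation for accelerated beam tests (generic
% illustration; values are conventions of the field, not this paper's data).
% Cross section: observed errors divided by the delivered beam fluence.
\sigma = \frac{N_{\mathrm{errors}}}{\Phi_{\mathrm{beam}}} \quad [\mathrm{cm}^2]
% FIT: expected failures in 10^9 device-hours at the reference atmospheric
% flux \phi_{\mathrm{ref}}, commonly taken as about 13\,\mathrm{n/(cm^2\,h)}
% at sea level in New York City for neutrons above 10 MeV.
\mathrm{FIT} = \sigma \cdot \phi_{\mathrm{ref}} \cdot 10^{9}
```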
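To make the decomposition discussed in the abstract concrete, here is a minimal PyTorch sketch of a ViT forward pass with a convolutional patch embedding followed by transformer layers (multi-head self-attention plus MLP). All module names, sizes, and the mean-pooling choice are illustrative assumptions; they do not correspond to the six models evaluated in the paper.

```python
# Minimal ViT skeleton illustrating the blocks named in the abstract:
# a convolutional patch embedding followed by transformer layers that
# combine multi-head self-attention (MHSA) and an MLP. Sizes and names
# are illustrative only; positional embeddings and the CLS token are
# omitted for brevity.
import torch
import torch.nn as nn

class TinyViT(nn.Module):
    def __init__(self, patch=16, dim=192, heads=3, depth=4, classes=1000):
        super().__init__()
        # Convolutional patch embedding: one strided conv turns the image
        # into a sequence of patch tokens (the variant the abstract reports
        # as lowering the FIT rate relative to a plain linear projection).
        self.embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, dim_feedforward=4 * dim,
            batch_first=True, norm_first=True)
        # Transformer layers (MHSA + MLP): per the abstract, the block
        # with the highest FIT rate.
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.head = nn.Linear(dim, classes)  # classification head

    def forward(self, x):
        x = self.embed(x)                  # (B, dim, H/patch, W/patch)
        x = x.flatten(2).transpose(1, 2)   # (B, tokens, dim)
        x = self.encoder(x)
        return self.head(x.mean(dim=1))    # mean-pool tokens, classify

logits = TinyViT()(torch.randn(1, 3, 224, 224))  # -> shape (1, 1000)
```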
Files in this record:
File: Vision_Transformer_Reliability_Evaluation_on_the_Coral_Edge_TPU.pdf
Access: open access
Type: 2. Post-print / Author's Accepted Manuscript
License: Creative Commons
Size: 7.58 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2995241