Vision Transformer Reliability Evaluation on the Coral Edge TPU
Coelho, Bruno Loureiro; Bodmann, Pablo R.; Cavagnero, Niccolo; Frost, Christopher; Rech, Paolo
In: IEEE Transactions on Nuclear Science, ISSN 0018-9499, 2024. DOI: 10.1109/TNS.2024.3513774
Abstract
Vision transformers (ViTs) outperform convolutional neural networks (CNNs) in tasks such as image classification and, despite their high computational complexity, can still be mapped to low-power EdgeAI accelerators such as the Coral Tensor Processing Unit (TPU). In this paper, through accelerated neutron beam experiments, we study the reliability of six ViTs and four micro-benchmarks on the Coral TPU. According to our data, the internal size of the attention heads (the main computational block in ViTs) has a negligible impact on the Failures In Time (FIT) rate of the model compared to increasing the number of heads; furthermore, our results show that employing convolutions in the patch embedding reduces the FIT rate of the model. Additionally, we decompose the ViT into four basic computational blocks that represent the main operators of the model, showing that, although the transformer layer (with multi-head self-attention and multi-layer perceptron) presents the highest FIT rate, it is actually the patch embedding that is more likely to cause misclassifications. These results can be leveraged to design hardening techniques that improve the resilience of the critical blocks of a ViT, identified in our evaluation, while minimizing the additional overhead.
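The record itself does not define how the FIT rate is obtained. In accelerated neutron-beam reliability studies the FIT rate (failures per 10^9 device-hours) is conventionally derived from the measured error cross section and scaled to the reference atmospheric flux; the formulation below follows this common JEDEC-style convention and is an assumption of this sketch, not a detail taken from the record:

```latex
% Assumed convention: sigma is the measured cross section from the beam run.
\sigma = \frac{N_{\mathrm{err}}}{\Phi},
\qquad
\mathrm{FIT} = \sigma \times \phi_{\mathrm{NYC}} \times 10^{9}
```

where N_err is the number of observed errors, Φ is the total neutron fluence delivered during the experiment (n/cm²), and φ_NYC ≈ 13 n/(cm²·h) is the reference atmospheric neutron flux at sea level in New York City.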
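To make the decomposition concrete, the following is a minimal sketch (in PyTorch, assumed here; it is not the authors' code and all sizes are illustrative) of the computational blocks the abstract names: a convolutional patch embedding, and a transformer layer combining multi-head self-attention (MHSA) and an MLP.

```python
# Minimal sketch of the ViT blocks named in the abstract (hypothetical sizes).
import torch
import torch.nn as nn

class PatchEmbedding(nn.Module):
    """Splits the image into patches and projects them to tokens; the strided
    convolution realizes the convolutional patch embedding the paper finds
    to lower the FIT rate."""
    def __init__(self, img_size=224, patch_size=16, in_ch=3, dim=192):
        super().__init__()
        self.proj = nn.Conv2d(in_ch, dim, kernel_size=patch_size, stride=patch_size)

    def forward(self, x):
        x = self.proj(x)                     # (B, dim, H/p, W/p)
        return x.flatten(2).transpose(1, 2)  # (B, num_patches, dim)

class MLP(nn.Module):
    """Two-layer perceptron applied token-wise inside the transformer layer."""
    def __init__(self, dim=192, hidden=768):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim))

    def forward(self, x):
        return self.net(x)

class TransformerLayer(nn.Module):
    """Pre-norm transformer layer: MHSA + MLP, each with a residual connection."""
    def __init__(self, dim=192, heads=3):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = MLP(dim, 4 * dim)

    def forward(self, x):
        h = self.norm1(x)
        x = x + self.attn(h, h, h, need_weights=False)[0]
        return x + self.mlp(self.norm2(x))

# Example: per the paper, varying `heads` (head count) matters more for the
# FIT rate than varying the per-head size (dim / heads).
tokens = PatchEmbedding()(torch.randn(1, 3, 224, 224))
out = TransformerLayer(dim=192, heads=3)(tokens)
print(out.shape)  # torch.Size([1, 196, 192])
```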
File | Description | Type | License | Size | Format
---|---|---|---|---|---
Vision_Transformer_Reliability_Evaluation_on_the_Coral_Edge_TPU.pdf (open access) | .pdf | 2. Post-print / Author's Accepted Manuscript | Creative Commons | 7.58 MB | Adobe PDF
https://hdl.handle.net/11583/2995241