Optimizing Vision Transformers: Leveraging Max and Min Operations for Efficient Pruning / Bich, P.; Boretti, C.; Prono, L.; Pareschi, F.; Rovatti, R.; Setti, G. - Print. - (2024), pp. 337-341. (Paper presented at the 2024 IEEE 6th International Conference on AI Circuits and Systems (AICAS), held in Abu Dhabi (United Arab Emirates), April 22-25, 2024) [10.1109/AICAS59952.2024.10595859].

Optimizing Vision Transformers: Leveraging Max and Min Operations for Efficient Pruning

Bich P.;Boretti C.;Prono L.;Pareschi F.;Setti G.
2024

Abstract

Research on Deep Neural Networks (DNNs) continues to enhance the performance of these models over a wide spectrum of tasks, increasing their adoption in many fields. This creates the need to extend their use to edge devices with limited resources, a task that has become increasingly complex with the advent of Transformer-based models because of their size. In this context, pruning emerges as a crucial tool to reduce the number of weights in the memory-hungry Fully Connected (FC) layers. This paper explores the use of neurons based on the Multiply-And-Max/min (MAM) operation, an alternative to the conventional Multiply-and-Accumulate (MAC), in a Vision Transformer (ViT). The Max and Min operations make the model more prunable. For the first time, many MAM-based FC layers are used in a large state-of-the-art DNN model and compressed with various pruning techniques from the literature. Experiments show that MAM-based layers achieve the same accuracy as traditional layers with up to 12 times fewer weights. In particular, when using Global Magnitude Pruning (GMP), the FC layers following the Multi-head Attention block of a ViT-B/16 model fine-tuned on CIFAR-100 retain only 560,000 weights if MAM neurons are used, compared to the 31.4 million that remain when using traditional MAC neurons.
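The abstract names the MAM neuron and Global Magnitude Pruning without detailing them. The Python sketch below shows one plausible reading: a MAM layer whose output is the sum of the maximum and the minimum of the input-weight products (in place of the MAC sum), and a GMP routine that zeroes the globally smallest-magnitude weights. The class and function names (MAMLinear, global_magnitude_prune) and all shapes are assumptions for illustration, not the authors' code.

```python
# Minimal sketch of the ideas named in the abstract: a Multiply-And-Max/min (MAM)
# fully connected layer and Global Magnitude Pruning (GMP). Names are illustrative
# assumptions, not the authors' reference implementation.
import torch
import torch.nn as nn


class MAMLinear(nn.Module):
    """FC layer where each output neuron keeps only the maximum and the minimum
    of its input-weight products instead of accumulating (summing) all of them."""

    def __init__(self, in_features: int, out_features: int, bias: bool = True):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(out_features, in_features))
        self.bias = nn.Parameter(torch.zeros(out_features)) if bias else None
        nn.init.xavier_uniform_(self.weight)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # products[..., j, i] = x[..., i] * weight[j, i]: one product per (input, neuron) pair
        products = x.unsqueeze(-2) * self.weight        # shape (..., out_features, in_features)
        # A MAC neuron would return products.sum(dim=-1); the MAM neuron keeps only
        # the two extreme products, so most weights rarely matter and prune well.
        out = products.max(dim=-1).values + products.min(dim=-1).values
        return out + self.bias if self.bias is not None else out


def global_magnitude_prune(layers, sparsity=0.95):
    """Zero the globally smallest-magnitude weights across the given layers (GMP sketch)."""
    all_w = torch.cat([l.weight.detach().abs().flatten() for l in layers])
    threshold = torch.quantile(all_w, sparsity)          # single global cut-off
    for l in layers:
        mask = (l.weight.detach().abs() > threshold).to(l.weight.dtype)
        l.weight.data *= mask


if __name__ == "__main__":
    layer = MAMLinear(64, 32)                            # small toy sizes for the demo
    y = layer(torch.randn(4, 64))
    print(y.shape)                                       # torch.Size([4, 32])
    global_magnitude_prune([layer], sparsity=0.95)
    print((layer.weight == 0).float().mean())            # roughly 0.95 of weights are zero
```

Under this reading, after pruning most weights never become the max or min of their neuron, which is consistent with the abstract's claim that MAM layers tolerate far higher sparsity than MAC layers.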
ISBN: 979-8-3503-8363-8
Files in this record:

Optimizing_Vision_Transformers_Leveraging_Max_and_Min_Operations_for_Efficient_Pruning.pdf
Description: Editorial version
Type: 2a Post-print, editorial version / Version of Record
License: Non-public - Private/restricted access (a copy can be requested)
Size: 1.54 MB
Format: Adobe PDF

aicas2024.pdf
Description: Author's version
Type: 2. Post-print / Author's Accepted Manuscript
License: Public - All rights reserved (open access)
Size: 298.86 kB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2991795