Combining Relevance and Magnitude for Resource-saving DNN Pruning / Chiasserini, C. F.; Malandrino, F.; Molner, N.; Zhao, Z. - In: IEEE NETWORK. - ISSN 0890-8044. - (2025). [10.1109/MNET.2025.3556212]

Combining Relevance and Magnitude for Resource-saving DNN Pruning

C. F. Chiasserini; Z. Zhao
2025

Abstract

Pruning neural networks, i.e., removing some of their parameters whilst retaining their accuracy, is one of the main ways to reduce the latency of a machine learning pipeline, especially in resource- and/or bandwidth-constrained scenarios. In this context, the pruning technique, i.e., how to choose the parameters to remove, is critical to the system performance. In this paper, we propose a novel pruning approach, called FlexRel, which combines training-time and inference-time information, namely, parameter magnitude and relevance, in order to improve the resulting accuracy whilst saving both computational resources and bandwidth. Our performance evaluation shows that FlexRel is able to achieve higher pruning factors, saving over 35% bandwidth for typical accuracy targets.
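The abstract does not state how FlexRel combines the two signals, so the following is only an illustrative sketch and should not be read as the authors' method. It assumes, hypothetically, that each parameter receives a score equal to a convex combination of its normalized magnitude and its normalized relevance, and that the lowest-scoring parameters are pruned. The function names, the mixing weight `alpha`, and the min-max normalization are all assumptions introduced here for illustration.

```python
from typing import List


def min_max_normalize(values: List[float]) -> List[float]:
    """Scale values to [0, 1]; a constant input maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def combined_prune_mask(
    weights: List[float],
    relevance: List[float],
    alpha: float,
    prune_fraction: float,
) -> List[bool]:
    """Return a keep-mask (True = keep) for a flat list of parameters.

    Hypothetical scoring rule (an assumption, not taken from the paper):
        score = alpha * normalized |weight| + (1 - alpha) * normalized |relevance|
    The int(prune_fraction * n) parameters with the lowest scores are pruned.
    """
    if len(weights) != len(relevance):
        raise ValueError("weights and relevance must have the same length")
    if not weights:
        return []
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if not 0.0 <= prune_fraction <= 1.0:
        raise ValueError("prune_fraction must lie in [0, 1]")

    magnitude_scores = min_max_normalize([abs(w) for w in weights])
    relevance_scores = min_max_normalize([abs(r) for r in relevance])
    scores = [
        alpha * m + (1.0 - alpha) * r
        for m, r in zip(magnitude_scores, relevance_scores)
    ]

    # Indices sorted by ascending score; the first n_prune are removed.
    n_prune = int(prune_fraction * len(weights))
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    pruned = set(order[:n_prune])
    return [i not in pruned for i in range(len(weights))]


if __name__ == "__main__":
    # Toy values chosen for illustration only.
    w = [0.9, -0.05, 0.4, 0.01]
    rel = [0.1, 0.8, 0.5, 0.0]
    print(combined_prune_mask(w, rel, alpha=0.5, prune_fraction=0.5))
```

In this sketch, `alpha = 1` reduces to pure magnitude pruning and `alpha = 0` to pure relevance pruning, which makes it easy to compare the combined criterion against either signal used alone.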
Files in this record:

File: 2025_POLITO_Netw_Mag_Pruning_Relevance.pdf
Access: open access
Type: 2. Post-print / Author's Accepted Manuscript
License: Public - All rights reserved
Size: 298.56 kB
Format: Adobe PDF

File: Combining_Relevance_and_Magnitude_for_Resource-saving_DNN_Pruning.pdf
Access: restricted access
Type: 2. Post-print / Author's Accepted Manuscript
License: Non-public - Private/restricted access
Size: 370.76 kB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2998404