Combining Relevance and Magnitude for Resource-saving DNN Pruning / Chiasserini, C. F.; Malandrino, F.; Molner, N.; Zhao, Z. - In: IEEE NETWORK. - ISSN 0890-8044. - (2025). [10.1109/MNET.2025.3556212]
Combining Relevance and Magnitude for Resource-saving DNN Pruning
C. F. Chiasserini; Z. Zhao
2025
Abstract
Pruning neural networks, i.e., removing some of their parameters whilst retaining their accuracy, is one of the main ways to reduce the latency of a machine learning pipeline, especially in resource- and/or bandwidth-constrained scenarios. In this context, the pruning technique, i.e., how to choose the parameters to remove, is critical to the system performance. In this paper, we propose a novel pruning approach, called FlexRel, predicated upon combining training-time and inference-time information, namely, parameter magnitude and relevance, in order to improve the resulting accuracy whilst saving both computational resources and bandwidth. Our performance evaluation shows that FlexRel is able to achieve higher pruning factors, saving over 35% bandwidth for typical accuracy targets.

File | Size | Format | |
---|---|---|---|
2025_POLITO_Netw_Mag_Pruning_Relevance.pdf (open access; type: 2. Post-print / Author's Accepted Manuscript; license: Public - All rights reserved) | 298.56 kB | Adobe PDF | |
Combining_Relevance_and_Magnitude_for_Resource-saving_DNN_Pruning.pdf (restricted access; type: 2. Post-print / Author's Accepted Manuscript; license: Non-public - Private/restricted access) | 370.76 kB | Adobe PDF | |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/2998404