Sparsification of Deep Neural Networks via Ternary Quantization / Dordoni, Luca; Migliorati, Andrea; Fracastoro, Giulia; Fosson, Sophie; Bianchi, Tiziano; Magli, Enrico. - (2024), pp. 1-6. (Paper presented at the 34th IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2024, held in London, UK, in 2024) [DOI: 10.1109/mlsp58920.2024.10734714].

Sparsification of Deep Neural Networks via Ternary Quantization

Dordoni, Luca; Migliorati, Andrea; Fracastoro, Giulia; Fosson, Sophie; Bianchi, Tiziano; Magli, Enrico
2024

Abstract

In recent years, the demand for compact deep neural networks (DNNs) has increased consistently, driven by the necessity to deploy them in environments with limited resources, such as mobile or embedded devices. Our work aims to tackle this challenge by proposing a combination of two techniques: sparsification and ternarization of network parameters. We extend plain binarization by introducing a sparsification interval centered around 0. The network parameters falling in this interval are set to 0 and effectively removed from the network topology. Despite the increased complexity of the ternarization scheme compared to a binary quantizer, we obtain remarkable sparsity rates that yield highly compressible parameter distributions with entropy lower than 1 bit/symbol.
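To make the quantization step concrete, here is a minimal sketch, not the authors' exact scheme: a hypothetical `ternarize` function that zeroes every weight inside a sparsification interval [-delta, delta] and binarizes the rest to +/- alpha, followed by a first-order entropy check on the resulting three-symbol source. The threshold delta, the scaling rule for alpha, and the Gaussian toy weights are all assumptions made for illustration.

```python
import numpy as np

def ternarize(w, delta):
    """Ternary quantizer sketch: weights inside the sparsification
    interval [-delta, delta] are set to 0; the remaining weights are
    binarized to +/- alpha, here taken as the mean magnitude of the
    surviving weights (the scaling rule is an assumption)."""
    mask = np.abs(w) > delta              # parameters kept in the topology
    alpha = np.abs(w[mask]).mean() if mask.any() else 0.0
    return np.where(mask, alpha * np.sign(w), 0.0), mask

def empirical_entropy(q):
    """First-order entropy (bits/symbol) of the quantized source."""
    _, counts = np.unique(q, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.02, size=10_000)    # toy Gaussian layer weights
q, mask = ternarize(w, delta=0.03)        # interval of 1.5 sigma around 0
print(f"sparsity: {1.0 - mask.mean():.2%}")                  # roughly 87% zeros
print(f"entropy:  {empirical_entropy(q):.3f} bits/symbol")   # below 1
```

With a highly sparse ternary source like this one, the zero symbol dominates and the empirical entropy falls well below 1 bit/symbol, which is what makes the quantized parameters so compressible.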
ISBN: 9798350372250
Files in this record:

Sparsification of Deep Neural Networks via Ternary Quantization_OA.pdf
Access: open access
Description: author's version
Type: 2. Post-print / Author's Accepted Manuscript
License: Public - All rights reserved
Size: 1.56 MB
Format: Adobe PDF

Sparsification_of_Deep_Neural_Networks_via_Ternary_Quantization.pdf
Access: restricted access
Description: published version
Type: 2a. Post-print editorial version / Version of Record
License: Non-public - Private/restricted access
Size: 1.63 MB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2995346