
PruNet: Class-Blind Pruning Method For Deep Neural Networks / Marchisio, Alberto; Abdullah Hanif, Muhammad; Martina, Maurizio; Shafique, Muhammad. - ELECTRONIC. - 1:(2018), pp. 1-8. (Paper presented at the International Joint Conference on Neural Networks, held in Rio de Janeiro (BR), 8-13 July 2018) [10.1109/IJCNN.2018.8489764].

PruNet: Class-Blind Pruning Method For Deep Neural Networks

Maurizio Martina;
2018

Abstract

DNNs are highly memory- and compute-intensive, which makes them impractical to deploy in real-time or mobile applications, where power and memory resources are scarce. Introducing sparsity into the network is one way to reduce these requirements. However, systematically applying pruning under given accuracy requirements is a challenging problem. We propose a novel methodology that iteratively applies magnitude-based Class-Blind pruning to compress a DNN into a sparse model. The methodology is generic and can be applied to different types of DNNs. We demonstrate that retraining after pruning is essential to restore the accuracy of the network. Experimental results show that our methodology reduces the model size by around two orders of magnitude without noticeably affecting accuracy. It requires several iterations of pruning and retraining, but achieves a Memory Saving Ratio of up to 190x (for LeNet on the MNIST dataset) compared to the baseline model. Similar results are obtained for more complex networks, e.g., 91x for VGG-16 on the CIFAR-100 dataset. Combining this work with an efficient encoding for sparse networks, such as Compressed Sparse Column (CSC) or Compressed Sparse Row (CSR), yields a reduced memory footprint. Our methodology can also be complemented by other compression techniques, such as weight sharing, quantization, or fixed-point conversion, which further reduce memory and computation.
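As a rough illustration of the approach described in the abstract, the sketch below shows magnitude-based class-blind pruning in NumPy: the magnitudes of all weights in the network are pooled into a single ranking, the globally smallest fraction is zeroed out, and the surviving weights would then be retrained before the next pruning iteration. The function names (class_blind_prune, iterative_prune, to_csr), the retrain_fn placeholder, the sparsity schedule, and the use of scipy.sparse for CSR storage are illustrative assumptions, not the authors' implementation.

import numpy as np
from scipy import sparse

def class_blind_prune(layer_weights, sparsity):
    """Zero out the globally smallest-magnitude weights.

    'Class-blind' here means the pruning threshold is computed over the
    weights of ALL layers pooled together, rather than per layer.

    layer_weights: list of 2-D NumPy arrays, one per layer (assumed format)
    sparsity:      fraction of all weights to remove, e.g. 0.9
    """
    all_magnitudes = np.concatenate([np.abs(w).ravel() for w in layer_weights])
    threshold = np.quantile(all_magnitudes, sparsity)

    masks = [np.abs(w) > threshold for w in layer_weights]
    pruned = [w * m for w, m in zip(layer_weights, masks)]
    return pruned, masks

def iterative_prune(layer_weights, retrain_fn, sparsity_schedule):
    """Alternate pruning and retraining, as the abstract describes.

    retrain_fn is a placeholder for whatever training loop restores
    accuracy while keeping the pruned weights fixed at zero (via the masks).
    """
    for sparsity in sparsity_schedule:          # e.g. [0.5, 0.7, 0.9]
        layer_weights, masks = class_blind_prune(layer_weights, sparsity)
        layer_weights = retrain_fn(layer_weights, masks)
    return layer_weights

def to_csr(pruned_layer):
    # Store a pruned layer in Compressed Sparse Row form to save memory.
    return sparse.csr_matrix(pruned_layer)

A CSR matrix stores only the non-zero values together with their column indices and row pointers, so the memory footprint of each layer shrinks roughly in proportion to the achieved sparsity.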
ISBN: 978-1-5090-6014-6
File in this record:
File: PruNet_sumbitted.pdf
Description: Submitted article
Type: 2. Post-print / Author's Accepted Manuscript
Access: open access
License: PUBLIC - All rights reserved
Size: 2.14 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2715856