Multiply-And-Max/min Neurons at the Edge: Pruned Autoencoder Implementation / Bich, Philippe; Prono, Luciano; Mangia, Mauro; Pareschi, Fabio; Rovatti, Riccardo; Setti, Gianluca. - ELECTRONIC. - (2023), pp. 629-633. (Paper presented at the 2023 IEEE 66th International Midwest Symposium on Circuits and Systems (MWSCAS), held in Tempe, AZ, USA, August 6-9, 2023) [10.1109/MWSCAS57524.2023.10405867].
Multiply-And-Max/min Neurons at the Edge: Pruned Autoencoder Implementation
Bich, Philippe; Prono, Luciano; Pareschi, Fabio; Setti, Gianluca
2023
Abstract
In response to the growing interest in Internet of Things (IoT) applications, several studies have explored ways to reduce the size of Deep Neural Networks (DNNs) so that they can be implemented on edge devices with tightly constrained resources. To this end, pruning removes redundant interconnections between neurons, reducing a DNN's memory footprint and computational complexity while minimizing the performance loss. In recent years, many works have proposed new pruning techniques and prunable architectures, but relatively little effort has been devoted to implementing them and validating their performance on hardware. Recently, we introduced neurons based on the Multiply-And-Max/min (MAM) map-reduce paradigm. When state-of-the-art unstructured pruning techniques are applied, MAM-based neurons have shown better pruning capabilities than standard neurons based on the Multiply-and-Accumulate (MAC) paradigm. In this work, we implement MAM on-device for the first time to demonstrate the feasibility of MAM-based DNNs at the edge. As a case study, we implement an autoencoder for electrocardiogram (ECG) signals on a low-end microcontroller unit (MCU), the STM32F767ZI, which is based on the ARM Cortex-M7. We show that the tail of a pruned MAM-based autoencoder fits on the target device while keeping a good reconstruction accuracy (Average Signal-to-Noise Ratio of 32.6 dB), whereas a standard MAC-based implementation with the same accuracy would not fit. Furthermore, the implemented MAM-based layer guarantees lower energy consumption and inference time than the MAC-based layer at the same level of performance.

File | Size | Format | Access |
---|---|---|---|
MWSCAS2023-Pruning.pdf. Description: Author's version. Type: 2. Post-print / Author's Accepted Manuscript. License: Public - All rights reserved | 381.64 kB | Adobe PDF | Open access |
Multiply-And-Max_min_Neurons_at_the_Edge_Pruned_Autoencoder_Implementation.pdf. Description: Editorial version. Type: 2a. Post-print, editorial version / Version of Record. License: Non-public - Private/restricted access | 390.13 kB | Adobe PDF | Restricted access |
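
The difference between the two map-reduce paradigms named in the abstract can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' on-device implementation: it assumes that the MAM reduce step adds the largest and the smallest of the weighted inputs, and it models pruned connections as entries missing from a dictionary. The names `mac_neuron`, `mam_neuron`, `dense`, and `pruned` are hypothetical.

```python
from typing import Dict, Sequence


def mac_neuron(weights: Dict[int, float], inputs: Sequence[float], bias: float = 0.0) -> float:
    """Multiply-and-Accumulate: map = multiply, reduce = sum over all kept connections."""
    return sum(w * inputs[i] for i, w in weights.items()) + bias


def mam_neuron(weights: Dict[int, float], inputs: Sequence[float], bias: float = 0.0) -> float:
    """Multiply-And-Max/min sketch: map = multiply, reduce = max + min (assumed form)."""
    products = [w * inputs[i] for i, w in weights.items()]
    if not products:  # every connection pruned: only the bias remains
        return bias
    return max(products) + min(products) + bias


# Hypothetical input vector and weights, chosen so every product is exact in floating point.
x = [0.5, -1.0, 2.0, 0.25]

# Dense neuron: one weight per input. Products are 1.0, -3.0, 1.0, 1.0.
dense = {0: 2.0, 1: 3.0, 2: 0.5, 3: 4.0}

# Unstructured pruning modelled by dropping connections 0 and 2 from the dictionary.
pruned = {1: 3.0, 3: 4.0}

print("MAC dense :", mac_neuron(dense, x))    # 0.0
print("MAC pruned:", mac_neuron(pruned, x))   # -2.0
print("MAM dense :", mam_neuron(dense, x))    # -2.0
print("MAM pruned:", mam_neuron(pruned, x))   # -2.0
```

For this hand-picked input, removing the two connections that produce neither the maximum nor the minimum product leaves the MAM output unchanged, whereas the MAC sum shifts. A single example like this only illustrates the mechanism; it is not evidence about reconstruction accuracy or about the results reported in the paper.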
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/2985907