Structured Sparse Back-propagation for Lightweight On-Device Continual Learning on Microcontroller Units / Paissan, Francesco; Nadalini, Davide; Rusci, Manuele; Ancilotto, Alberto; Conti, Francesco; Benini, Luca; Farella, Elisabetta. - (2024), pp. 2172-2181. (Paper presented at the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), held in Seattle, WA, USA, 17-18 June 2024) [10.1109/CVPRW63382.2024.00222].
Structured Sparse Back-propagation for Lightweight On-Device Continual Learning on Microcontroller Units
Davide Nadalini; Francesco Conti; Luca Benini; Elisabetta Farella
2024
Abstract
With many devices deployed at the extreme edge in dynamic environments, the ability to learn continually on the device is a fast-emerging trend for ultra-low-power Microcontrollers (MCUs). The key challenge in enabling Continual Learning (CL) on highly constrained MCUs is to curtail memory and computational requirements. This paper proposes a novel CL strategy based on sparse weight updates coupled with Latent Replay. We reduce the latency and memory requirements of the backpropagation algorithm by computing structured sparse update tensors for the trainable parameters, retaining only partial activations during the forward pass, and limiting the per-layer gradient computation to a subset of channels. When applied to lightweight Deep Neural Network (DNN) models for image classification, namely PhiNet and MobileNetV2, our method reduces the memory and computation costs of the backpropagation algorithm by up to 1.3x with a minor accuracy drop (2%). Furthermore, we evaluate the accuracy-latency-memory trade-off targeting a class-incremental CL setup on a RISC-V multi-core MCU. The proposed approach allows learning a new class-incremental task composed of two unseen classes on-device in 18 min with 4.63 MB, considering the most demanding configuration, i.e., a MobileNetV2 trained on the CORe50 dataset.
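The core idea of limiting per-layer gradient computation to a subset of channels can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the function name `sparse_channel_backward`, the layer shape, and the choice of active channels are illustrative assumptions; the point is that the weight-gradient outer product is computed only for the selected output channels (structured row sparsity), so both the gradient arithmetic and the activations that must be retained scale with the active subset rather than the full layer.

```python
import numpy as np

def sparse_channel_backward(x, w, grad_out, active_channels):
    """Backward pass of a linear layer (out = x @ w.T) that computes
    weight gradients only for a structured subset of output channels.
    Rows of grad_w outside `active_channels` are left at zero, i.e.
    those channels receive no update."""
    grad_w = np.zeros_like(w)
    # Restrict the gradient outer product to the selected channel rows:
    # cost is |active| / n_channels of the dense computation.
    grad_w[active_channels] = grad_out[:, active_channels].T @ x
    # Input gradient for upstream layers (with Latent Replay, layers
    # below the replay point are frozen, so this may be skipped).
    grad_x = grad_out @ w
    return grad_w, grad_x

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16))        # batch of latent activations
w = rng.standard_normal((32, 16))       # layer with 32 output channels
grad_out = rng.standard_normal((8, 32)) # gradient from the layer above
active = np.array([0, 3, 7, 12])        # update only 4 of 32 channels

grad_w, grad_x = sparse_channel_backward(x, w, grad_out, active)
```

On the active rows the result matches the dense gradient exactly; the remaining rows stay zero, which is what makes the sparsity "structured" and cheap to exploit on an MCU (whole channels are skipped, rather than scattered individual weights).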
File | Type | License | Size | Format | Access
---|---|---|---|---|---
Structured_Sparse_Back-propagation_for_Lightweight_On-Device_Continual_Learning_on_Microcontroller_Units.pdf | 2a Post-print / Version of Record | Non-public - private/restricted access | 3.68 MB | Adobe PDF | Restricted access (request a copy)
Paissan_Structured_Sparse_Back-propagation_for_Lightweight_On-Device_Continual_Learning_on_Microcontroller_CVPRW_2024_paper.pdf | 2a Post-print / Version of Record | Public - all rights reserved | 3.11 MB | Adobe PDF | Open access
Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.
https://hdl.handle.net/11583/2993215