TCN Mapping Optimization for Ultra-Low Power Time-Series Edge Inference / Burrello, Alessio; Dequino, Alberto; Jahier Pagliari, Daniele; Conti, Francesco; Zanghieri, Marcello; Macii, Enrico; Benini, Luca; Poncino, Massimo. - ELECTRONIC. - 2021:(2021), pp. 1-6. (Paper presented at the 2021 IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED 2021), held in the USA in 2021) [10.1109/ISLPED52811.2021.9502494].

TCN Mapping Optimization for Ultra-Low Power Time-Series Edge Inference

Burrello, Alessio; Dequino, Alberto; Jahier Pagliari, Daniele; Macii, Enrico; Poncino, Massimo
2021

Abstract

Temporal Convolutional Networks (TCNs) are emerging lightweight Deep Learning models for Time Series analysis. We introduce an automated exploration approach and a library of optimized kernels to map TCNs on Parallel Ultra-Low Power (PULP) microcontrollers. Our approach minimizes latency and energy by exploiting a layer tiling optimizer to jointly find the tiling dimensions and select among alternative implementations of the causal and dilated 1D-convolution operations at the core of TCNs. We benchmark our approach on a commercial PULP device, achieving up to 103× lower latency and 20.3× lower energy than the Cube-AI toolkit executed on the STM32L4, and from 2.9× to 26.6× lower energy compared to commercial closed-source and academic open-source approaches on the same hardware target.
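
For reference, the sketch below shows the causal, dilated 1D convolution that the abstract identifies as the core TCN operation. This is a minimal, unoptimized C version assuming an int8 channel-major layout; the function and parameter names (tcn_conv1d_causal_dilated, c_in, c_out, t_len, k_size, dilation) are illustrative assumptions and do not come from the paper's kernel library, whose optimized kernels additionally exploit tiling, SIMD, and multi-core parallelism on PULP.

```c
#include <stdint.h>

/* Minimal sketch (not the paper's API) of a causal, dilated 1D convolution.
 * Input:   x[c_in][t_len]        channel-major int8 time series
 * Weights: w[c_out][c_in][k_size]
 * Output:  y[c_out][t_len]       32-bit accumulators (requantization omitted)
 * Causality: each output y[:, t] reads only samples at times t - dilation*j,
 * i.e. the receptive field extends dilation*(k_size-1) steps into the past;
 * out-of-range taps act as implicit zero padding on the left. */
static void tcn_conv1d_causal_dilated(const int8_t *x, const int8_t *w,
                                      int32_t *y, int c_in, int c_out,
                                      int t_len, int k_size, int dilation)
{
    for (int co = 0; co < c_out; co++) {
        for (int t = 0; t < t_len; t++) {
            int32_t acc = 0;
            for (int ci = 0; ci < c_in; ci++) {
                for (int k = 0; k < k_size; k++) {
                    /* k = k_size-1 taps the current sample, k = 0 the oldest */
                    int t_in = t - dilation * (k_size - 1 - k);
                    if (t_in >= 0)
                        acc += (int32_t)x[ci * t_len + t_in] *
                               (int32_t)w[(co * c_in + ci) * k_size + k];
                }
            }
            y[co * t_len + t] = acc;
        }
    }
}
```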
2021
ISBN: 978-1-6654-3922-0
Files in this product:

ISPLED21___TCN_Library.pdf
  Access: open access
  Description: Main article (post-print)
  Type: 2. Post-print / Author's Accepted Manuscript
  License: Public - All rights reserved
  Size: 672.96 kB
  Format: Adobe PDF

TCN_Mapping_Optimization_for_Ultra-Low_Power_Time-Series_Edge_Inference.pdf
  Access: restricted access
  Description: Main article (publisher's version)
  Type: 2a. Post-print publisher's version / Version of Record
  License: Non-public - Private/restricted access
  Size: 766.81 kB
  Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2924896