STORM: Hardware-Aware Tiny Transformer Co-Design for Low-Power Inertial Human Activity Recognition / Varaldi, Alessandro; Genta, Claudio; Manzone, Alberto; Vacca, Marco. - In: ELECTRONICS. - ISSN 2079-9292. - ELECTRONIC. - 15:9 (2026). [10.3390/electronics15091924]

STORM: Hardware-Aware Tiny Transformer Co-Design for Low-Power Inertial Human Activity Recognition

Alessandro Varaldi; Marco Vacca
2026

Abstract

Human Activity Recognition (HAR) from inertial sensors must run continuously on battery-powered wearables under tight latency, memory, and energy budgets. While tiny Transformers can be effective on inertial time series, end-to-end co-design across quantized inference and heterogeneous low-power platforms remains underexplored. We present STORM (Small Transformer for On-node Recognition of Motion), a deployment-oriented 19.7k-parameter 1D Transformer co-designed with X-HEEP, an open-source low-power single-core RISC-V SoC, and a tightly coupled streaming CGRA for nonlinear primitives (e.g., softmax). We build a cross-source 8-class benchmark by harmonizing 3 public datasets under a stringent, deployment-aligned protocol that exposes both cross-subject and cross-source shift. Using 1.280 s windows with 0.640 s stride, the protocol models continuous on-node HAR under cross-dataset generalization. After quantization-aware training and INT8 C inference export, STORM achieves 0.799/0.801 accuracy/macro-F1 on this benchmark. Deployed on an FPGA prototype of X-HEEP with the streaming CGRA backend, STORM requires 67.4 ms per inference at 100 MHz, while activity-based power analysis estimates a total inference energy of 632.4 μJ, satisfying the stride-driven real-time constraint. These results support the practical viability of compact attention-based HAR on low-power wearable-class embedded platforms.
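The windowing protocol and the real-time constraint described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the sampling rate is not stated in this record and is assumed to be 50 Hz here, and `segment` is a hypothetical helper name. The stride-driven constraint simply requires one inference (67.4 ms, as reported) to complete within one 0.640 s hop.

```python
import numpy as np

FS_HZ = 50                    # assumed sampling rate (not given in the abstract)
WIN = int(1.280 * FS_HZ)      # 1.280 s window -> 64 samples at 50 Hz
STRIDE = int(0.640 * FS_HZ)   # 0.640 s stride -> 32-sample hop (50% overlap)

def segment(signal: np.ndarray) -> np.ndarray:
    """Split a (T, C) inertial stream into overlapping (N, WIN, C) windows."""
    n = (len(signal) - WIN) // STRIDE + 1
    return np.stack([signal[i * STRIDE : i * STRIDE + WIN] for i in range(n)])

# Real-time check: one inference per stride must finish within the hop period.
latency_ms, hop_ms = 67.4, 640.0   # latency and stride reported in the abstract
assert latency_ms < hop_ms         # well within the stride-driven budget
```

Under these assumptions a 6.4 s, 6-channel recording (320 samples) yields 9 overlapping windows of shape (64, 6), one classified per stride.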

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this item: https://hdl.handle.net/11583/3010517
Warning: the displayed data have not been validated by the university.