Concurrent Pipeline Stages Optimization for Embedded Keyword Spotting / Zema, Giacomo; Peluso, Valentino; Calimera, Andrea; Macii, Enrico. - (2023), pp. 0427-0433. (Paper presented at the World AI IoT Congress, held in Seattle, WA (USA), 7-10 June 2023) [10.1109/AIIoT58121.2023.10174531].

Concurrent Pipeline Stages Optimization for Embedded Keyword Spotting

Zema, Giacomo; Peluso, Valentino; Calimera, Andrea; Macii, Enrico
2023

Abstract

Voice interaction has rapidly become the preferred option for human-machine interaction, especially in domotics and wearable applications. In such contexts, accurate yet resource-efficient keyword-spotting (KWS) systems for identifying user commands and queries are paramount. In this work, we focus on optimizing the development of KWS pipelines for voice user interfaces running on commercial embedded systems that can be deployed at the edge of the Internet of Things. Specifically, we investigate the joint optimization of the frontend and ConvNet stages of the KWS information processing pipeline, demonstrating how a holistic approach can significantly improve on the quality of results obtained by standard methods, which optimize each stage of the pipeline independently. As the main contributions, we first define and discretize the design space to identify the most significant hyperparameters, and then introduce a framework to search for the dominant pipeline implementations in the accuracy-latency objective space. Ported to and tested on an ARM Cortex-A72 CPU core embedded in a Raspberry Pi 4 board, the explored solutions show substantial improvements in accuracy (best-case 1.8%, average 0.82%) and latency (best-case 27.36%, average 15.32%) compared to state-of-the-art approaches.
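
The notion of "dominant" implementations in a two-objective space can be illustrated with a small sketch. It assumes the term is meant in the Pareto sense (no other candidate is at least as accurate and at least as fast, and strictly better on one of the two). The paper's actual search framework is not described on this page, so the `Candidate` type, the configuration names, and all numbers below are hypothetical placeholders rather than results from the paper.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    name: str          # hypothetical label for one frontend + ConvNet configuration
    accuracy: float    # fraction of correct predictions, higher is better
    latency_ms: float  # inference latency in milliseconds, lower is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a.accuracy >= b.accuracy and a.latency_ms <= b.latency_ms
    strictly_better = a.accuracy > b.accuracy or a.latency_ms < b.latency_ms
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Placeholder values for illustration only (not measurements from the paper).
candidates = [
    Candidate("cfg_a", 0.90, 12.0),
    Candidate("cfg_b", 0.92, 15.0),
    Candidate("cfg_c", 0.89, 14.0),  # dominated by cfg_a
    Candidate("cfg_d", 0.92, 18.0),  # dominated by cfg_b
    Candidate("cfg_e", 0.94, 25.0),
]

front = sorted(pareto_front(candidates), key=lambda c: c.latency_ms)
print([c.name for c in front])
```

The exhaustive pairwise check is quadratic in the number of candidates, which is usually acceptable for a discretized design space of modest size; a larger space would call for a sort-based sweep instead.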
ISBN: 979-8-3503-3761-7
Files in this item:
Concurrent_Pipeline_Stages_Optimization_for_Embedded_Keyword_Spotting.pdf (not available)
Type: 2a Post-print, publisher's version / Version of Record
License: Not public - private/restricted access
Size: 1.78 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2984688