Speech Recognition with Cochlea-Inspired In-Sensor Computing / Beoletto, P. H.; Milano, G.; Ricciardi, C.; Bosia, F.; Gliozzi, A. S. - In: ADVANCED INTELLIGENT SYSTEMS. - ISSN 2640-4567. - ELECTRONIC. - (2025), pp. 1-12. [10.1002/aisy.202500526]

Speech Recognition with Cochlea-Inspired In-Sensor Computing

Beoletto P. H.; Milano G.; Ricciardi C.; Bosia F.; Gliozzi A. S.
2025

Abstract

Traditional speech recognition methods rely on software-based feature extraction, which introduces latency and high energy costs and makes them unsuitable for low-power devices. A proof-of-concept demonstration of a bioinspired tonotopic sensor for speech recognition is presented; the sensor mimics the human cochlea using a spiral-shaped elastic metamaterial. The measured modal response of the structure at different frequencies generates a spatially distributed signal, providing a spatiotemporal map of the input called a "tonogram". The device acts as an in-sensor physical reservoir computing system, working simultaneously as a sensor and as a computing unit, and is capable of extracting the features of spoken words that are relevant to speech recognition. Results indicate that this approach can serve as a valid alternative to traditional software-based digital preprocessing, ensuring high classification accuracy while reducing computational requirements. This work demonstrates the potential of bioinspired metamaterials for energy-efficient auditory sensing and, beyond speech recognition, for applications such as IoT devices and edge-computing artificial intelligence systems.
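
As an illustration only, the idea of a tonogram (a map of signal energy over time and over frequency-selective positions) can be mimicked in software. The sketch below is not the authors' device or method: it replaces the physical metamaterial with a bank of Gaussian frequency bands applied to short-time spectra, and every name in it (`tonogram`, `centre_freqs`, `rel_bandwidth`, the chosen channel frequencies) is a hypothetical assumption introduced for this example.

```python
import numpy as np

def tonogram(signal, fs, centre_freqs, frame_len=256, rel_bandwidth=0.1):
    """Software stand-in for a tonotopic sensor (illustrative assumption).

    Returns an array of shape (n_frames, n_channels): the energy seen by
    each frequency-selective channel in each time frame.
    """
    signal = np.asarray(signal, dtype=float)
    centre_freqs = np.asarray(centre_freqs, dtype=float)

    # Split the signal into non-overlapping frames (the time axis).
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)

    # Power spectrum of each Hann-windowed frame.
    window = np.hanning(frame_len)
    power = np.abs(np.fft.rfft(frames * window, axis=1)) ** 2
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / fs)

    # One Gaussian band per channel, loosely playing the role of a
    # resonance localized at one position along the structure.
    widths = rel_bandwidth * centre_freqs
    weights = np.exp(
        -0.5 * ((freqs[None, :] - centre_freqs[:, None]) / widths[:, None]) ** 2
    )
    return power @ weights.T

# Usage: a pure 1 kHz tone should excite the 1 kHz channel most strongly.
fs = 8000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 1000.0 * t)
channels = np.array([250.0, 500.0, 1000.0, 2000.0])
tg = tonogram(tone, fs, channels)
strongest = int(np.argmax(tg.mean(axis=0)))
```

In a reservoir-computing setting, a map of this kind would typically be passed to a simple trained readout (for example a linear classifier); that stage is omitted here because the abstract does not specify it.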

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/3006060