Speech Recognition with Cochlea-Inspired In-Sensor Computing / Beoletto, P. H.; Milano, G.; Ricciardi, C.; Bosia, F.; Gliozzi, A. S. - In: ADVANCED INTELLIGENT SYSTEMS. - ISSN 2640-4567. - ELECTRONIC. - (2025), pp. 1-12. [10.1002/aisy.202500526]
Speech Recognition with Cochlea-Inspired In-Sensor Computing
Beoletto P. H.; Milano G.; Ricciardi C.; Bosia F.; Gliozzi A. S.
2025
Abstract
Traditional speech recognition methods rely on software-based feature extraction, which introduces latency and high energy costs and makes them unsuitable for low-power devices. Here, a proof-of-concept demonstration is provided of a bioinspired tonotopic sensor for speech recognition that mimics the human cochlea using a spiral-shaped elastic metamaterial. The measured modal response of the structure at different frequencies generates a spatially distributed signal, providing a spatiotemporal map of the input termed a "tonogram". The device acts as an in-sensor physical reservoir computing system, working simultaneously as a sensor and as a computing unit, and is capable of extracting the features of spoken words relevant to speech recognition. Results indicate that this approach can serve as a valid alternative to traditional software-based digital preprocessing, ensuring high classification accuracy while reducing computational requirements. This work demonstrates the potential of bioinspired metamaterials for energy-efficient auditory sensing and, beyond speech recognition, for applications such as IoT devices and edge-computing artificial intelligence systems.
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/3006060
Warning: the data displayed have not been validated by the university.
