Augmented and Virtual Reality (AR/VR) technologies are gaining popularity for the training of healthcare professionals, with precise eye tracking playing a crucial role in enhancing performance. However, these systems need to be both low-latency and low-power to operate in real-time scenarios on resource-constrained devices. Event-based cameras can be employed to address these requirements, as they offer energy-efficient, high temporal resolution data with minimal battery drain. However, their sparse data format necessitates specialized processing algorithms. In this work, we propose a data preprocessing technique that improves the performance of non-recurrent Deep Neural Networks (DNNs) for pupil position estimation. With this approach, we integrate multiple time surfaces of events over time, with a leakage factor, so that the input data is enriched with information from past events. Additionally, in order to better distinguish between recent and old information, we generate multiple memory channels characterized by different leakage/forgetting rates. These memory channels are fed to well-known non-recurrent neural estimators to predict the position of the pupil. As an example, by using time surfaces only and feeding them to a MobileNet-V3L model to track the pupil in DVS recordings, we achieve a P10 accuracy (Euclidean error lower than ten pixels) of 85.40%, whereas by using memory channels we achieve a P10 accuracy of 94.37% with a negligible time overhead.
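The leaky integration of time surfaces described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name, array shapes, and leak-rate values are hypothetical, and a placeholder random array stands in for a real event time surface.

```python
import numpy as np

def update_memory_channels(memory, time_surface, leak_rates):
    """Leaky integration of a new time surface into several memory
    channels, each with its own leakage/forgetting rate.

    memory       : (C, H, W) array, one channel per leak rate
    time_surface : (H, W) array built from the latest slice of events
    leak_rates   : length-C sequence of decay factors in (0, 1);
                   a rate near 1 forgets slowly, near 0 forgets quickly
    """
    for c, lam in enumerate(leak_rates):
        # Old content decays by lam, the new time surface is accumulated,
        # so each channel keeps a differently weighted history of events.
        memory[c] = lam * memory[c] + time_surface
    return memory

# Hypothetical usage: three channels with fast/medium/slow forgetting,
# stacked as the multi-channel input of a non-recurrent estimator.
H, W = 64, 64
memory = np.zeros((3, H, W))
for _ in range(10):
    ts = np.random.rand(H, W)  # stand-in for a real event time surface
    memory = update_memory_channels(memory, ts, leak_rates=(0.3, 0.7, 0.95))
```

Because the history is carried entirely in these exponentially decaying channels, the downstream network itself can remain non-recurrent, which is what keeps the time overhead negligible.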
Memory in Motion: Exploring Leaky Integration of Time Surfaces for Event-based Eye-tracking / Boretti, Chiara; Bich, Philippe; Prono, Luciano; Pareschi, Fabio; Rovatti, Riccardo; Setti, Gianluca. - ELECTRONIC. - (2024), pp. 1-5. (Paper presented at the 2024 IEEE Biomedical Circuits and Systems Conference (BioCAS), held in Xi'an, China, 24-26 October 2024) [10.1109/biocas61083.2024.10798345].
Memory in Motion: Exploring Leaky Integration of Time Surfaces for Event-based Eye-tracking
Boretti, Chiara; Bich, Philippe; Prono, Luciano; Pareschi, Fabio; Rovatti, Riccardo; Setti, Gianluca
2024
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/2996076