Impact of contextual and lip-sync-related visual cues on speech intelligibility through immersive audio-visual scene recordings in a reverberant conference room / Guastamacchia, Angela; Galletto, Andrea; Riente, Fabrizio; Shtrepi, Louena; Puglisi, Giuseppina Emma; Albera, Andrea; Pellerey, Franco; Astolfi, Arianna. - ELECTRONIC. - (2024). (Paper presented at the INTER-NOISE 2024 conference).

Impact of contextual and lip-sync-related visual cues on speech intelligibility through immersive audio-visual scene recordings in a reverberant conference room

Guastamacchia, Angela; Galletto, Andrea; Riente, Fabrizio; Shtrepi, Louena; Puglisi, Giuseppina Emma; Albera, Andrea; Pellerey, Franco; Astolfi, Arianna
2024

Abstract

Recent hearing research has benefited from the latest virtual reality systems, which allow immersive audio-visual scenarios to be reproduced for more ecologically valid listening tests. Efforts have been made to identify the aspects that convey actual ecological validity, particularly by investigating the effects of visual cues and self-motion on speech intelligibility through tests mainly based on simulated scenes. However, little work has addressed scenes built from real recordings made inside reverberant environments. This study used 3rd-order Ambisonics recordings and stereoscopic 360° videos captured inside a reverberant conference hall to create three virtual audio-visual scenes in which speech intelligibility tests were performed, with informational noise introduced from different angles. A 16-speaker spherical array synchronized with a head-mounted display was used to administer the immersive tests to 50 normal-hearing subjects. First, tests composed only of the auditory scenes were compared, on the basis of the achieved scores, with tests that also provided contextual and positional source-related visual cues, both with and without self-motion, for a total of four test configurations. Then, to complete the investigation of the impact of visual cues on speech intelligibility, ten normal-hearing subjects were recruited to perform audio-visual tests incorporating lip-sync-related visual cues for the target speech.
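As context for the playback method described in the abstract (the record publishes no code), the sketch below illustrates one standard way to render a 16-channel, 3rd-order Ambisonics recording to a 16-loudspeaker spherical array: a basic mode-matching (pseudo-inverse) decoder in Python. This is a minimal sketch under stated assumptions, not the authors' reproduction chain; the ACN/orthonormal channel convention and the two-ring loudspeaker layout are illustrative assumptions.

```python
# Hedged sketch only, NOT the authors' pipeline: decode a 16-channel,
# 3rd-order Ambisonics signal (ACN channel order, orthonormal scaling
# assumed) to a hypothetical 16-loudspeaker spherical array using a
# mode-matching (pseudo-inverse) decoder.
import numpy as np
from scipy.special import sph_harm  # complex SH with Condon-Shortley phase

def real_sh_matrix(order, azi, zen):
    """Real spherical harmonics in ACN order, orthonormal scaling,
    evaluated at directions (azimuth, zenith) in radians.
    Returns an array of shape (n_dirs, (order + 1) ** 2)."""
    cols = []
    for l in range(order + 1):
        for m in range(-l, l + 1):
            y = sph_harm(abs(m), l, azi, zen)
            if m < 0:    # negative degrees come from the imaginary part
                cols.append(np.sqrt(2.0) * (-1) ** m * y.imag)
            elif m > 0:  # positive degrees come from the real part
                cols.append(np.sqrt(2.0) * (-1) ** m * y.real)
            else:        # m == 0 is already real-valued
                cols.append(y.real)
    return np.stack(cols, axis=-1)

# Hypothetical layout: two rings of 8 loudspeakers at +/-20 deg elevation.
azi = np.radians(np.tile(np.arange(0.0, 360.0, 45.0), 2))
zen = np.pi / 2 - np.radians(np.repeat([20.0, -20.0], 8))

Y = real_sh_matrix(3, azi, zen)  # (16 loudspeakers, 16 SH channels)
D = np.linalg.pinv(Y)            # decoding matrix; pinv also handles a
                                 # poorly conditioned layout gracefully

# ambi: (n_samples, 16) HOA signals; a 1 s noise placeholder stands in
# for a real recording. AmbiX files use SN3D, so each channel of degree
# l would first be scaled by sqrt(2*l + 1) to match this convention.
ambi = np.random.randn(48000, 16)
speaker_feeds = ambi @ D         # -> (n_samples, 16) loudspeaker signals
```

In practice, dedicated decoder tools (for example the IEM AllRADecoder) with layout-specific tuning would be preferred over a bare pseudo-inverse, but the matrix operation above captures the underlying principle.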

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2993217