Soundy: A Multimodal Audio Interface for Educational Music Production / Buccellato, Pietro; Bianco, Andrea; Rottondi, Cristina. - In: IEEE ACCESS. - ISSN 2169-3536. - 13:(2025), pp. 153105-153122. [10.1109/access.2025.3604631]
Soundy: A Multimodal Audio Interface for Educational Music Production
Buccellato, Pietro; Bianco, Andrea; Rottondi, Cristina
2025
Abstract
Multimodal interaction has emerged as a promising approach to enrich user experience and foster engagement in Educational Music Production (EMP) environments. Traditional audio interfaces and Digital Audio Workstations (DAWs), while powerful, often present engagement challenges, particularly for beginners, and tend to limit opportunities for spontaneous creativity and hands-on exploration during the learning process. To overcome these limitations, this paper introduces Soundy, a multimodal audio interface that integrates an on-board, lightweight web-based DAW. The system is accessible via a standard browser through a Local Area Network (LAN) and allows users to control audio production tasks using predefined and custom voice and facial commands. A comparative within-subject study was conducted with 20 participants, each of whom tested both Soundy and a standard configuration composed of a Behringer UMC404HD audio interface and the Reaper DAW. The evaluation focused on two key dimensions, usability and engagement, measured using two standardized and validated instruments: the Post-Study System Usability Questionnaire (PSSUQ) and the User Engagement Scale - Short Form (UES-SF), complemented by open-ended participant feedback. Results revealed a trade-off between configurations: the standard setup ensured higher usability and task efficiency, while Soundy promoted greater engagement, creativity, and exploratory behavior. These findings suggest that embedded multimodal solutions based on voice and facial interaction hold strong potential for enhancing student experience in EMP and support future developments in adaptive command recognition and integrated hardware design.

| File | Size | Format |
|---|---|---|
| Soundy_A_Multimodal_Audio_Interface_for_Educational_Music_Production.pdf (open access; type: 2a Post-print publisher's version / Version of Record; license: Creative Commons) | 3.86 MB | Adobe PDF |
https://hdl.handle.net/11583/3006477
