From Words to Emotions: Evaluating Text-to-Motion Body Language for Believable Emotions in Virtual Humans / Calzolari, Stefano; Annicchiarico, Ciro; Strada, Francesco; Bottino, Andrea. - PRINT. - (2025). (Paper presented at the International Conference on eXtended Reality (XR Salento) 2025, held in Otranto, Italy).
From Words to Emotions: Evaluating Text-to-Motion Body Language for Believable Emotions in Virtual Humans
Calzolari, Stefano; Annicchiarico, Ciro; Strada, Francesco; Bottino, Andrea
2025
Abstract
Virtual humans are computer-generated characters designed for believable interaction across extended reality applications such as gaming, training, and therapy. Their believability, which contributes to more immersive and engaging virtual experiences, is significantly enhanced by emotions conveyed through facial expressions, vocal prosody, and expressive body language. Currently, creating emotional animations with a focus on body language involves labor-intensive manual processes or costly motion-capture techniques. Text-to-motion synthesis leverages artificial intelligence to generate animations from textual prompts, offering a promising alternative that simplifies animation workflows. However, existing models often prioritize physical realism over emotional expressiveness, an aspect that remains underexplored. This study evaluates the emotional believability of animations generated by four text-to-motion models (LADiff, MDM, T2MGPT, Muse Animate). A user study involving 39 participants assessed basic emotions portrayed during common actions, revealing that emotions such as anger and sadness were most recognizable through body language alone, while surprise and disgust posed greater challenges. These results align with previous research on body language, highlighting the strengths and limitations of AI-generated emotional animations and offering insights for enhancing virtual human expressiveness.
File: paper_final.pdf (restricted access)
Type: 2. Post-print / Author's Accepted Manuscript
License: Non-public - Private/restricted access
Size: 2.63 MB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/3000652