Explainable Emotion Recognition Using Xception-Based Feature Extraction and Supervised Machine Learning on the RAVDESS Dataset / Shah, Syed Taimoor Hussain; Shah, Syed Adil Hussain; Panagiotopoulos, Konstantinos; Pigueiras-del-Real, Janet; Qayyum, Kainat; Baqir Hussain Shah, Syed; Buccoliero, Andrea; Di Terlizzi, Angelo; Deriu, Marco Agostino. - (2025), pp. -1. (Paper presented at the 2025 IEEE Medical Measurements & Applications (MeMeA) conference, held in Chania, Greece) [10.1109/memea65319.2025.11068008].

Explainable Emotion Recognition Using Xception-Based Feature Extraction and Supervised Machine Learning on the RAVDESS Dataset

Syed Taimoor Hussain Shah; Syed Adil Hussain Shah; Konstantinos Panagiotopoulos; Marco Agostino Deriu
2025

Abstract

Facial emotion recognition is a valuable tool in healthcare, offering insight into emotional well-being, developmental progress, and health-related behaviors. This study presents a novel framework that integrates deep learning with explainable artificial intelligence (XAI) to improve emotion recognition from video data. Using the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS), the framework begins with preprocessing: 3D face meshes with 478 landmarks are generated with MediaPipe, and regions of interest (ROIs) are extracted. Data augmentation techniques, including rotation, scaling, and translation, increase dataset variability. Features are extracted with a fine-tuned Xception deep convolutional neural network and then classified with supervised machine learning algorithms such as support vector machines (SVM), k-nearest neighbors (KNN), ensemble methods, and artificial neural networks (ANN). Among these, the Fine Gaussian SVM (FGSVM) performed best, with 93.87% accuracy on both the validation and test sets. Precision, recall, and F1-score were 94.06%, 93.79%, and 93.93% on the validation set and 94.01%, 93.74%, and 93.88% on the test set. For interpretability, XAI techniques such as Grad-CAM, LIME, occlusion sensitivity, and SHAP highlight the facial landmarks and temporal frames that most influence predictions. The study underscores the potential of combining deep learning with XAI to improve reliability in healthcare applications, with benefits for clinical decision-making, mental health monitoring, and human-computer interaction. A Python-based implementation of the proposed framework is available at DOI 10.5281/zenodo.14809940.
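The abstract reports accuracy, precision, recall, and F1-score but does not say how the per-class values are aggregated. As a minimal sketch, assuming macro-averaging (the unweighted mean over emotion classes), these four figures can be derived from a confusion matrix as shown below. The function name `macro_metrics` and the 3-class toy matrix are hypothetical illustrations; they are not taken from the paper or from its Zenodo implementation.

```python
from typing import Dict, List


def macro_metrics(confusion: List[List[int]]) -> Dict[str, float]:
    """Accuracy plus macro-averaged precision, recall, and F1.

    confusion[i][j] counts samples whose true class is i and whose
    predicted class is j. Macro-averaging is an assumption here; the
    abstract does not state which averaging scheme was used.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n_classes))

    precisions, recalls, f1_scores = [], [], []
    for k in range(n_classes):
        tp = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n_classes))  # column sum
        actual_k = sum(confusion[k])                                  # row sum
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)

    return {
        "accuracy": correct / total if total else 0.0,
        "precision": sum(precisions) / n_classes,
        "recall": sum(recalls) / n_classes,
        "f1": sum(f1_scores) / n_classes,
    }


# Hypothetical 3-class confusion matrix (rows: true class, columns: predicted).
toy_confusion = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 1, 9],
]
toy_scores = macro_metrics(toy_confusion)
for name, value in toy_scores.items():
    print(f"{name}: {value:.4f}")
```

Under this scheme, the macro F1-score is the mean of the per-class F1 values, which can differ slightly from the harmonic mean of the macro precision and macro recall. Either convention is compatible with the rounded figures reported above.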
ISBN: 979-8-3315-2347-3

Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/3001792