A novel approach using deep belief network patterns and attention binary decomposition for automated community emotion detection / Yildiz, Arif Metehan; Barua, Prabal Datta; Baygin, Mehmet; Dogan, Sengul; Tuncer, Turker; Salvi, Massimo; Tan, Ru-San; Acharya, U. R. - In: BIOMEDICAL SIGNAL PROCESSING AND CONTROL. - ISSN 1746-8094. - 116:(2026). [10.1016/j.bspc.2026.109534]

A novel approach using deep belief network patterns and attention binary decomposition for automated community emotion detection

Salvi, Massimo
2026

Abstract

Sound-based community emotion detection (SCED) estimates community emotion from environmental sounds. It has value for public safety and human–computer interaction. Current SCED models have limited adaptivity on complex audio and often need manual tuning.

Objective: We aim to design an accurate and efficient automated SCED model for large-scale data.

Methods: We propose a feature extraction framework that combines DBNPat feature generation with ATT-BP attention-driven binary compression. The framework adapts to signal characteristics with low computational cost. We also introduce a new dataset of 10,017 environmental sound clips (three seconds) with negative (n = 1,729), neutral (n = 6,154), and positive (n = 2,134) classes.

Results: The proposed SCED model achieves 87.28% accuracy on three-class SCED. It yields 81.30% UAR, 84.71% precision, 82.97% F1, and 80.59% geometric mean on the imbalanced dataset.

Conclusion: The model links classical feature design and deep pattern generation in one adaptive pipeline. It offers a practical solution for digital sound forensics and other ambient-audio systems that need fine emotion cues.

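
The abstract reports UAR and geometric mean alongside accuracy because the three classes are unevenly sized. The sketch below is an illustration only, not the authors' evaluation code. It uses the class counts stated in the abstract and a hypothetical baseline that labels every clip "neutral". It assumes UAR is the unweighted mean of per-class recalls and that the geometric mean is taken over per-class recalls; the abstract does not spell out either definition.

```python
# Class counts as reported in the abstract.
counts = {"negative": 1729, "neutral": 6154, "positive": 2134}
labels = list(counts)
total = sum(counts.values())


def per_class_recall(cm):
    """cm[i][j] = number of clips with true class i predicted as class j."""
    return [row[i] / sum(row) for i, row in enumerate(cm)]


def accuracy(cm):
    """Fraction of all clips on the diagonal of the confusion matrix."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / sum(sum(row) for row in cm)


def uar(cm):
    """Unweighted average recall: mean of per-class recalls (assumed definition)."""
    recalls = per_class_recall(cm)
    return sum(recalls) / len(recalls)


def geometric_mean(cm):
    """Geometric mean of per-class recalls (assumed definition)."""
    recalls = per_class_recall(cm)
    product = 1.0
    for r in recalls:
        product *= r
    return product ** (1 / len(recalls))


# Hypothetical baseline (NOT a result from the paper): always predict "neutral".
majority = labels.index("neutral")
baseline_cm = [
    [counts[label] if j == majority else 0 for j in range(len(labels))]
    for label in labels
]

if __name__ == "__main__":
    for label in labels:
        print(f"{label}: {counts[label] / total:.2%} of clips")
    print(f"baseline accuracy:       {accuracy(baseline_cm):.2%}")
    print(f"baseline UAR:            {uar(baseline_cm):.2%}")
    print(f"baseline geometric mean: {geometric_mean(baseline_cm):.2%}")
```

Under these assumptions, the always-neutral baseline is correct on roughly 61% of clips, yet its UAR is one third and its geometric mean is zero, since two of the three classes are never recognized. This is why the recall-based figures are the more informative ones on this dataset.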
Files in this item:
(2026) paper - SCED emotion community.pdf

Access: open access
Type: 2a Post-print, publisher's version / Version of Record
License: Creative Commons
Size: 3.38 MB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/3006485