Sound-based community emotion detection (SCED) estimates community emotion from environmental sounds. It has value for public safety and human–computer interaction. Current SCED models adapt poorly to complex audio and often need manual tuning. Objective: We aim to design an accurate and efficient automated SCED model for large-scale data. Methods: We propose a feature extraction framework that combines DBNPat feature generation with ATT-BP attention-driven binary compression. The framework adapts to signal characteristics with low computational cost. We also introduce a new dataset of 10,017 three-second environmental sound clips labeled as negative (n = 1,729), neutral (n = 6,154), or positive (n = 2,134). Results: The proposed SCED model achieves 87.28% accuracy on three-class SCED, with 81.30% unweighted average recall (UAR), 84.71% precision, 82.97% F1, and 80.59% geometric mean on the imbalanced dataset. Conclusion: The model links classical feature design and deep pattern generation in one adaptive pipeline. It offers a practical solution for digital sound forensics and other ambient-audio systems that need fine-grained emotion cues.
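The class counts above make the dataset imbalanced, which is presumably why the abstract reports UAR and geometric mean alongside accuracy. The following is a minimal Python sketch, not the authors' code, that uses only the reported class counts. It assumes UAR is the mean of per-class recalls and that the geometric mean is taken over per-class recalls (this record does not define either metric), and it evaluates a hypothetical majority-class baseline for comparison.

```python
from math import prod

# Class counts as reported in the abstract.
counts = {"negative": 1729, "neutral": 6154, "positive": 2134}
total = sum(counts.values())


def uar(recalls):
    """Unweighted average recall: plain mean of per-class recalls."""
    return sum(recalls) / len(recalls)


def geometric_mean(recalls):
    """Geometric mean of per-class recalls (assumed definition)."""
    return prod(recalls) ** (1 / len(recalls))


# Hypothetical baseline (not from the paper): always predict the majority class.
majority = max(counts, key=counts.get)
baseline_recalls = [1.0 if label == majority else 0.0 for label in counts]
baseline_accuracy = counts[majority] / total

print(f"total clips: {total}")
print(f"majority class: {majority}")
print(f"baseline accuracy: {100 * baseline_accuracy:.2f}%")
print(f"baseline UAR: {100 * uar(baseline_recalls):.2f}%")
print(f"baseline geometric mean: {100 * geometric_mean(baseline_recalls):.2f}%")
```

Under these assumptions, a predictor that always outputs the majority class reaches roughly 61% accuracy but a UAR of one third and a geometric mean of zero, which is the kind of gap that recall-based metrics are meant to expose.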
A novel approach using deep belief network patterns and attention binary decomposition for automated community emotion detection / Yildiz, Arif Metehan; Barua, Prabal Datta; Baygin, Mehmet; Dogan, Sengul; Tuncer, Turker; Salvi, Massimo; Tan, Ru-San; Acharya, U. R.. - In: BIOMEDICAL SIGNAL PROCESSING AND CONTROL. - ISSN 1746-8094. - 116:(2026). [10.1016/j.bspc.2026.109534]
A novel approach using deep belief network patterns and attention binary decomposition for automated community emotion detection
Salvi, Massimo;
2026
| File | Size | Format |
|---|---|---|
| (2026) paper - SCED emotion community.pdf (open access; type: 2a Post-print, publisher's version / Version of Record; license: Creative Commons) | 3.38 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/3006485
