Mitigating Subgroup Disparities in Speech Models: A Divergence-Aware Dual Strategy / Koudounas, Alkis; Pastor, Eliana; Alfaro, Luca de; Baralis, Elena. - In: IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING. - ISSN 2329-9290. - 33:(2025), pp. 883-895. [10.1109/taslpro.2025.3539429]
Mitigating Subgroup Disparities in Speech Models: A Divergence-Aware Dual Strategy
Koudounas, Alkis; Pastor, Eliana; Baralis, Elena
2025
Abstract
Speech models may exhibit disparities in performance across different population subgroups. Prior mitigation efforts often rely on the manual, user-driven selection of predefined data subgroups of interest. However, they fail to correctly identify all relevant subgroups associated with performance issues. We propose to mitigate the performance disparities of subgroups that underperform, i.e., exhibit a divergence, relative to overall model performance. We tackle these disparities from two alternative perspectives: an in-processing one, which implements mitigation measures during model development, and a post-processing one, which refines already trained models. For the in-processing scenario, we propose two approaches, a divergence-based regularization and a data augmentation technique, to boost subgroup performance during model fine-tuning. The post-processing strategy introduces a divergence-aware data acquisition method that prioritizes acquiring samples from underperforming subgroups. Experiments on a dataset for Automatic Speech Recognition, one for Emotion Recognition, and two datasets for Intent Classification in English and Italian show that the divergence-aware strategies significantly reduce performance disparities and outperform traditional clustering-, KNN-, error-driven-, and random-based methods.

File | Access | Type | License | Size | Format
---|---|---|---|---|---
Mitigating_Subgroup_Disparities_in_Speech_Models_A_Divergence-Aware_Dual_Strategy.pdf | Restricted access | 2a Post-print publisher's version / Version of Record | Non-public - Private/restricted access | 2.69 MB | Adobe PDF
Mitigating_Subgroup_Disparities_in_Speech_Models_ postprint.pdf | Open access | 2. Post-print / Author's Accepted Manuscript | Public - All rights reserved | 742.24 kB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/2997382