Mitigating Subgroup Disparities in Speech Models: A Divergence-Aware Dual Strategy / Koudounas, Alkis; Pastor, Eliana; Alfaro, Luca de; Baralis, Elena. - In: IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING. - ISSN 2329-9290. - 33:(2025), pp. 883-895. [10.1109/taslpro.2025.3539429]

Mitigating Subgroup Disparities in Speech Models: A Divergence-Aware Dual Strategy

Koudounas, Alkis; Pastor, Eliana; Baralis, Elena
2025

Abstract

Speech models may exhibit disparities in performance across different population subgroups. Prior mitigation efforts often rely on manual, user-driven selection of predefined data subgroups of interest. However, these approaches fail to correctly identify all relevant subgroups associated with performance issues. We propose to mitigate performance disparities of subgroups that underperform, i.e., exhibit a divergence, relative to overall model performance. We tackle these disparities from two alternative perspectives: an in-processing one, which implements mitigation measures during model development, and a post-processing one, which refines already trained models. For the in-processing scenario, we propose two approaches: a divergence-based regularization and a data augmentation technique to boost subgroup performance during model fine-tuning. The post-processing strategy introduces a divergence-aware data acquisition method that prioritizes acquiring samples from underperforming subgroups. Experiments on one dataset for Automatic Speech Recognition, one for Emotion Recognition, and two for Intent Classification (in English and Italian) show that the divergence-aware strategies significantly reduce performance disparities and outperform traditional clustering-, KNN-, error-driven-, and random-based methods.
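
To make the notion of divergence concrete, the minimal sketch below computes, for each subgroup, the difference between that subgroup's accuracy and the overall accuracy, and flags subgroups with a negative value as underperforming. It is an illustration under stated assumptions, not the authors' implementation: the function name `subgroup_divergence`, the choice of accuracy as the metric, the subgroup labels, and the toy data are all hypothetical, and the abstract does not specify how subgroups are discovered.

```python
from collections import defaultdict


def subgroup_divergence(correct, groups):
    """Return {group: subgroup accuracy - overall accuracy}.

    correct: list of 0/1 flags, one per utterance (1 = prediction was right).
    groups:  list of subgroup labels aligned with `correct`
             (hypothetical metadata, e.g. speaker attributes).
    """
    if len(correct) != len(groups) or not correct:
        raise ValueError("inputs must be non-empty and of equal length")

    overall = sum(correct) / len(correct)

    totals = defaultdict(int)  # number of utterances per subgroup
    hits = defaultdict(int)    # number of correct predictions per subgroup
    for flag, group in zip(correct, groups):
        totals[group] += 1
        hits[group] += flag

    return {g: hits[g] / totals[g] - overall for g in totals}


# Toy data, made up for illustration only.
correct = [1, 1, 1, 0, 1, 0, 0, 1]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

divergence = subgroup_divergence(correct, groups)

# With accuracy, a negative divergence means the subgroup underperforms.
underperforming = sorted(g for g, d in divergence.items() if d < 0)
```

For an error-rate metric such as word error rate, where higher values are worse, the sign convention would flip, and a positive divergence would mark the underperforming subgroups.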
Files in this item:

Mitigating_Subgroup_Disparities_in_Speech_Models_A_Divergence-Aware_Dual_Strategy.pdf

restricted access

Type: 2a Post-print, publisher's version / Version of Record
License: Non-public - Private/restricted access
Size: 2.69 MB
Format: Adobe PDF

Mitigating_Subgroup_Disparities_in_Speech_Models_ postprint.pdf

open access

Type: 2. Post-print / Author's Accepted Manuscript
License: Public - All rights reserved
Size: 742.24 kB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2997382