Exploring Subgroup Performance In End-to-End Speech Models / Koudounas, Alkis; Pastor, Eliana; Attanasio, Giuseppe; Mazzia, Vittorio; Giollo, Manuel; Gueudre, Thomas; Cagliero, Luca; de Alfaro, Luca; Baralis, Elena Maria; Amberti, Daniele. - (2023), pp. 1-5. (Paper presented at the 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), held in Rhodes Island (Greece), 4-10 June 2023) [10.1109/ICASSP49357.2023.10095284].
Exploring Subgroup Performance In End-to-End Speech Models
Alkis Koudounas; Eliana Pastor; Luca Cagliero; Elena Baralis
2023
Abstract
End-to-End Spoken Language Understanding models are generally evaluated according to their overall accuracy, or separately on (a priori defined) data subgroups of interest. We propose a technique for analyzing model performance at the subgroup level, which considers all subgroups that can be defined via a given set of metadata and are above a specified minimum size. The metadata can represent user characteristics, recording conditions, and speech targets. Our technique is based on advances in model bias analysis, enabling efficient exploration of resulting subgroups. A fine-grained analysis reveals how model performance varies across subgroups, identifying modeling issues or bias towards specific subgroups. We compare the subgroup-level performance of models based on wav2vec 2.0 and HuBERT on the Fluent Speech Commands dataset. The experimental results illustrate how subgroup-level analysis reveals a finer and more complete picture of performance changes when models are replaced, automatically identifying the subgroups that most benefit or fail to benefit from the change.
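To make the idea of subgroup-level evaluation concrete, the following is a minimal sketch, not the authors' implementation. It enumerates subgroups by brute force over combinations of metadata values, whereas the abstract refers to a more efficient exploration technique drawn from model bias analysis. The function name `subgroup_divergence`, the parameters `min_support` and `max_len`, the toy metadata, and the choice of "subgroup accuracy minus overall accuracy" as the reported gap are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def subgroup_divergence(records, correct, min_support=0.1, max_len=2):
    """Score every metadata-defined subgroup that is large enough.

    records: list of dicts mapping a metadata attribute to its value
    correct: list of bools, True where the model prediction was right
    Returns {subgroup: (support, accuracy, accuracy - overall_accuracy)},
    where a subgroup is a tuple of (attribute, value) pairs.
    """
    n = len(records)
    overall = sum(correct) / n
    counts = defaultdict(lambda: [0, 0])  # subgroup -> [size, hits]
    for rec, ok in zip(records, correct):
        items = sorted(rec.items())
        # Every combination of this record's attribute values is a
        # subgroup that the record belongs to.
        for k in range(1, max_len + 1):
            for combo in combinations(items, k):
                counts[combo][0] += 1
                counts[combo][1] += int(ok)
    result = {}
    for combo, (size, hits) in counts.items():
        support = size / n
        if support >= min_support:  # keep only subgroups above the minimum size
            accuracy = hits / size
            result[combo] = (support, accuracy, accuracy - overall)
    return result


# Hypothetical toy data: two metadata attributes, four utterances.
records = [
    {"gender": "F", "action": "increase"},
    {"gender": "F", "action": "decrease"},
    {"gender": "M", "action": "increase"},
    {"gender": "M", "action": "decrease"},
]
correct = [True, True, True, False]

scores = subgroup_divergence(records, correct, min_support=0.25)
# Print subgroups from most under-performing to most over-performing.
for subgroup, (support, accuracy, gap) in sorted(scores.items(), key=lambda kv: kv[1][2]):
    print(subgroup, support, accuracy, gap)
```

Running the same function on the predictions of two models and subtracting the per-subgroup accuracies would give a rough view of which subgroups gain or lose when one model replaces the other. Brute-force enumeration grows quickly with the number of attributes, so it is suitable only for small metadata sets.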
https://hdl.handle.net/11583/2976783