Prioritizing Data Acquisition For End-to-End Speech Model Improvement / Koudounas, Alkis; Pastor, Eliana; Attanasio, Giuseppe; de Alfaro, Luca; Baralis, Elena. - ELECTRONIC. - (2024), pp. 7000-7004. (Paper presented at the 2024 IEEE International Conference on Acoustics, Speech and Signal Processing, held in Seoul (KOR), 14-19 April 2024) [10.1109/ICASSP48485.2024.10446326].
Prioritizing Data Acquisition For End-to-End Speech Model Improvement
Koudounas, Alkis; Pastor, Eliana; Attanasio, Giuseppe; de Alfaro, Luca; Baralis, Elena
2024
Abstract
As speech processing moves toward more data-hungry models, data selection and acquisition become crucial to building better systems. Recent efforts have championed quantity over quality, following the mantra "The more data, the better." However, not all data bring the same benefit. This paper proposes a data acquisition solution that yields better models with less data and at lower cost. Given a model, a task, and an objective to maximize, we propose a process with three steps. First, we assess the model's baseline performance on the task. Second, we use efficient mining techniques to identify the subgroups that maximize the target objective if acquired first as new samples. Because the subgroups are interpretable, we can determine which samples to acquire. Third, we run incremental training, sampling from those subgroups. Experiments with two state-of-the-art speech models for intent classification across two datasets in English and Italian show that our method significantly outperforms random acquisition, complete acquisition, and clustering-based techniques.

File | Size | Format | |
---|---|---|---|
Paper.pdf (open access; description: Prioritizing Data Acquisition For End-to-End Speech Model Improvement; type: 2. Post-print / Author's Accepted Manuscript; license: Public - All rights reserved) | 173.96 kB | Adobe PDF | |
Prioritizing_Data_Acquisition_for_end-to-end_Speech_Model_Improvement.pdf (not available; type: 2a. Post-print, publisher's version / Version of Record; license: Not public - Private/restricted access) | 871.17 kB | Adobe PDF | |
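
The three-step process described in the abstract (baseline assessment, subgroup mining, prioritized acquisition) might be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the mining technique, so exhaustive enumeration of attribute-value combinations with a minimum-support threshold stands in for it. The metadata attributes (`gender`, `noise`), the `correct` flag, the toy data, and the function names `mine_subgroups` and `prioritize` are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def mine_subgroups(records, attributes, min_support=0.3, max_len=2):
    """Rank attribute-value subgroups by accuracy gap to the overall accuracy.

    records: list of dicts holding metadata attributes plus a boolean "correct"
    flag (the baseline model's per-sample outcome from step 1).
    Returns (divergence, pattern) pairs, most underperforming subgroup first.
    """
    n = len(records)
    overall = sum(r["correct"] for r in records) / n
    stats = defaultdict(lambda: [0, 0])  # pattern -> [count, n_correct]
    for r in records:
        for k in range(1, max_len + 1):
            for attrs in combinations(attributes, k):
                pattern = tuple((a, r[a]) for a in attrs)
                stats[pattern][0] += 1
                stats[pattern][1] += r["correct"]
    ranked = [
        (n_correct / count - overall, pattern)
        for pattern, (count, n_correct) in stats.items()
        if count / n >= min_support  # drop subgroups that are too rare
    ]
    ranked.sort(key=lambda item: item[0])  # most negative divergence first
    return ranked


def prioritize(pool, ranked, top_k=1, budget=3):
    """Pick up to `budget` candidates, taking matches of the top_k subgroups first."""
    targets = [dict(pattern) for _, pattern in ranked[:top_k]]

    def matches(sample):
        return any(all(sample.get(a) == v for a, v in t.items()) for t in targets)

    matching = [s for s in pool if matches(s)]
    rest = [s for s in pool if not matches(s)]
    return (matching + rest)[:budget]


# Step 1 (assumed already done): per-sample outcomes of the baseline model.
records = [
    {"gender": "f", "noise": "high", "correct": False},
    {"gender": "f", "noise": "high", "correct": False},
    {"gender": "f", "noise": "low", "correct": True},
    {"gender": "m", "noise": "high", "correct": True},
    {"gender": "m", "noise": "low", "correct": True},
    {"gender": "m", "noise": "low", "correct": True},
]
# Candidate samples that could be acquired, described by the same metadata.
pool = [
    {"id": "a", "gender": "m", "noise": "low"},
    {"id": "b", "gender": "f", "noise": "high"},
    {"id": "c", "gender": "f", "noise": "low"},
    {"id": "d", "gender": "f", "noise": "high"},
]

ranked = mine_subgroups(records, ["gender", "noise"])  # step 2
selected = prioritize(pool, ranked)                    # input to step 3
```

In this toy setup the lowest-ranked subgroup is the conjunction of `gender = f` and `noise = high`, so candidates matching it are selected before any others. In step 3, the selected samples would be added to the training set and the model trained incrementally; that part depends on the speech model and is omitted here.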
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/2986419