Prioritizing Data Acquisition For End-to-End Speech Model Improvement / Koudounas, Alkis; Pastor, Eliana; Attanasio, Giuseppe; de Alfaro, Luca; Baralis, Elena. - Electronic. - (2024), pp. 7000-7004. (Paper presented at the 2024 IEEE International Conference on Acoustics, Speech and Signal Processing, held in Seoul (KOR), 14-19 April 2024) [10.1109/ICASSP48485.2024.10446326].

Prioritizing Data Acquisition For End-to-End Speech Model Improvement

Koudounas, Alkis; Pastor, Eliana; Attanasio, Giuseppe; de Alfaro, Luca; Baralis, Elena
2024

Abstract

As speech processing moves toward more data-hungry models, data selection and acquisition become crucial to building better systems. Recent efforts have championed quantity over quality, following the mantra "The more data, the better." However, not all data bring the same benefit. This paper proposes a data acquisition solution that yields better models with less data and at lower cost. Given a model, a task, and an objective to maximize, we propose a three-step process. First, we assess the model's baseline performance on the task. Second, we use efficient mining techniques to identify the subgroups that, if acquired first as new samples, maximize the target objective. Because the subgroups are interpretable, we can determine which samples to acquire. Third, we run incremental training, sampling from those subgroups. Experiments with two state-of-the-art speech models for Intent Classification across two datasets, in English and Italian, show that our method significantly outperforms random acquisition, complete acquisition, and clustering-based techniques.
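The first two steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: the brute-force enumeration stands in for the "efficient mining techniques" mentioned above, accuracy is assumed as the objective, and the function names (`mine_subgroups`, `select_from_pool`), the metadata attributes (`gender`, `noise`), and the toy data are all hypothetical. Step three (incremental training on the selected samples) is left out.

```python
from collections import defaultdict
from itertools import combinations


def mine_subgroups(records, correct, min_support=0.2, max_len=2):
    """Rank attribute-value subgroups by their accuracy gap to the whole set.

    records: list of dicts with per-utterance metadata (hypothetical attributes).
    correct: list of bools, True where the baseline model predicted correctly.
    Returns (subgroup, divergence, count) tuples, worst subgroup first.
    """
    n = len(records)
    overall = sum(correct) / n
    stats = defaultdict(lambda: [0, 0])  # subgroup -> [count, num_correct]
    for rec, ok in zip(records, correct):
        items = sorted(rec.items())
        for k in range(1, max_len + 1):
            for combo in combinations(items, k):
                stats[combo][0] += 1
                stats[combo][1] += int(ok)
    ranked = [
        (combo, hits / count - overall, count)
        for combo, (count, hits) in stats.items()
        if count / n >= min_support  # drop subgroups that are too rare
    ]
    # Most negative divergence first; larger subgroups win ties.
    ranked.sort(key=lambda t: (t[1], -t[2], t[0]))
    return ranked


def select_from_pool(pool, subgroup, budget):
    """Return indices of up to `budget` pool samples that match the subgroup."""
    matches = [
        i for i, rec in enumerate(pool)
        if all(rec.get(attr) == val for attr, val in subgroup)
    ]
    return matches[:budget]


# Toy data (invented for illustration only).
records = [
    {"gender": "female", "noise": "high"},
    {"gender": "female", "noise": "low"},
    {"gender": "male", "noise": "high"},
    {"gender": "male", "noise": "low"},
    {"gender": "female", "noise": "high"},
    {"gender": "male", "noise": "high"},
]
correct = [False, True, False, True, False, True]
pool = [
    {"gender": "male", "noise": "low"},
    {"gender": "female", "noise": "high"},
    {"gender": "female", "noise": "high"},
    {"gender": "female", "noise": "low"},
]

ranked = mine_subgroups(records, correct)
worst_subgroup = ranked[0][0]
to_acquire = select_from_pool(pool, worst_subgroup, budget=1)
```

Because each subgroup is a readable conjunction of attribute values, the top-ranked entry directly tells a practitioner which kind of sample to acquire next.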
ISBN: 979-8-3503-4485-1
Files in this item:

File: Paper.pdf
Access: open access
Description: Prioritizing Data Acquisition For End-to-End Speech Model Improvement
Type: 2. Post-print / Author's Accepted Manuscript
License: Public - All rights reserved
Size: 173.96 kB
Format: Adobe PDF

File: Prioritizing_Data_Acquisition_for_end-to-end_Speech_Model_Improvement.pdf
Access: not available
Type: 2a Post-print publisher's version / Version of Record
License: Non-public - Private/restricted access
Size: 871.17 kB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2986419