Recent large-scale Spoken Language Understanding (SLU) datasets focus predominantly on English and do not account for language-specific phenomena such as particular phonemes or words in different lects. We introduce ITALIC, the first large-scale speech dataset designed for intent classification in Italian. The dataset comprises 16,521 crowdsourced audio samples recorded by 70 speakers from various Italian regions and annotated with intent labels and additional metadata. We explore the versatility of ITALIC by evaluating current state-of-the-art speech and text models. Results on intent classification suggest that increasing scale and running language adaptation yield better speech models, that monolingual text models outscore multilingual ones, and that speech recognition on ITALIC is more challenging than on existing Italian benchmarks. We release both the dataset and the annotation scheme to streamline the development of new Italian SLU models and language-specific datasets.

ITALIC: An Italian Intent Classification Dataset / Koudounas, Alkis; La Quatra, Moreno; Vaiani, Lorenzo; Colomba, Luca; Attanasio, Giuseppe; Pastor, Eliana; Cagliero, Luca; Baralis, Elena. - ELECTRONIC. - (2023), pp. 2153-2157. (Paper presented at the INTERSPEECH 2023 conference, held in Dublin (Ireland), 20 August - 24 August 2023) [10.21437/Interspeech.2023-1980].

ITALIC: An Italian Intent Classification Dataset

Alkis Koudounas;Moreno La Quatra;Lorenzo Vaiani;Luca Colomba;Giuseppe Attanasio;Eliana Pastor;Luca Cagliero;Elena Baralis
2023

Files in this item:

italic_oa.pdf
Access: open access
Description: Open Access version
Type: 2. Post-print / Author's Accepted Manuscript
License: PUBLIC - All rights reserved
Size: 864.1 kB
Format: Adobe PDF

koudounas23_interspeech.pdf
Access: open access
Type: 2a. Post-print, publisher's version / Version of Record
License: PUBLIC - All rights reserved
Size: 919.92 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2980659