ITALIC: An Italian Intent Classification Dataset / Koudounas, Alkis; La Quatra, Moreno; Vaiani, Lorenzo; Colomba, Luca; Attanasio, Giuseppe; Pastor, Eliana; Cagliero, Luca; Baralis, Elena. - ELECTRONIC. - (2023), pp. 2153-2157. (Paper presented at the INTERSPEECH 2023 conference, held in Dublin (Ireland), 20 August - 24 August 2023) [10.21437/Interspeech.2023-1980].
ITALIC: An Italian Intent Classification Dataset
Alkis Koudounas;Moreno La Quatra;Lorenzo Vaiani;Luca Colomba;Giuseppe Attanasio;Eliana Pastor;Luca Cagliero;Elena Baralis
2023
Abstract
Recent large-scale Spoken Language Understanding datasets focus predominantly on English and do not account for language-specific phenomena such as particular phonemes or words in different lects. We introduce ITALIC, the first large-scale speech dataset designed for intent classification in Italian. The dataset comprises 16,521 crowdsourced audio samples recorded by 70 speakers from various Italian regions and annotated with intent labels and additional metadata. We explore the versatility of ITALIC by evaluating current state-of-the-art speech and text models. Results on intent classification suggest that increasing scale and running language adaptation yield better speech models, that monolingual text models outscore multilingual ones, and that speech recognition on ITALIC is more challenging than on existing Italian benchmarks. We release both the dataset and the annotation scheme to streamline the development of new Italian SLU models and language-specific datasets.

File | Size | Format |
---|---|---|
italic_oa.pdf (open access). Description: Open Access version. Type: 2. Post-print / Author's Accepted Manuscript. License: Public - All rights reserved | 864.1 kB | Adobe PDF |
koudounas23_interspeech.pdf (open access). Type: 2a Post-print publisher's version / Version of Record. License: Public - All rights reserved | 919.92 kB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/2980659