Semi-supervised cross-lingual speech emotion recognition

Agarla, Mirko; Bianco, Simone; Celona, Luigi; Napoletano, Paolo; Petrovsky, Alexey; Piccoli, Flavio; Schettini, Raimondo; Shanin, Ivan

doi:10.1016/j.eswa.2023.121368

Performance in Speech Emotion Recognition (SER) on a single language has increased greatly in the last few years thanks to the use of deep learning techniques. However, cross-lingual SER remains a challenge in real-world applications due to two main factors: first, the large gap between the source and target domain distributions; second, the fact that far more unlabeled than labeled utterances are available for the new language. Taking these factors into account, we propose a Semi-Supervised Learning (SSL) method for cross-lingual emotion recognition when only a few labeled examples in the target domain (i.e. the new language) are available. Our method is based on a Transformer and adapts to the new domain by exploiting a pseudo-labeling strategy on the unlabeled utterances. In particular, we investigate the use of both hard and soft pseudo-labels. We thoroughly evaluate the performance of the proposed method in a speaker-independent setup on both the source and the new language and show its robustness across five languages belonging to different linguistic strains. The experimental findings indicate that the unweighted accuracy is increased by an average of 40% compared to state-of-the-art methods.
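The distinction between hard and soft pseudo-labels mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the confidence threshold, and the sharpening temperature are all assumptions chosen for illustration, and the abstract does not say how the paper selects or weights pseudo-labels.

```python
import math


def softmax(logits):
    """Turn raw class scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def hard_pseudo_label(probs, threshold=0.9):
    """Return the most likely class index, or None if the model is not
    confident enough. The threshold is an assumed hyperparameter."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    return best if probs[best] >= threshold else None


def soft_pseudo_label(probs, temperature=0.5):
    """Keep the whole distribution as the target, sharpened by an
    assumed temperature (values below 1 make it more peaked)."""
    powered = [p ** (1.0 / temperature) for p in probs]
    total = sum(powered)
    return [p / total for p in powered]


def one_hot(index, num_classes):
    """Encode a hard label as a distribution with all mass on one class."""
    return [1.0 if i == index else 0.0 for i in range(num_classes)]


def cross_entropy(target, predicted, eps=1e-12):
    """Loss between a target distribution (hard or soft) and a prediction."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, predicted))


# Hypothetical model scores for one unlabeled utterance, four emotion classes.
logits = [2.0, 0.5, 0.1, -1.0]
probs = softmax(logits)

hard = hard_pseudo_label(probs, threshold=0.6)  # class index or None
soft = soft_pseudo_label(probs)                 # full distribution

if hard is not None:
    hard_loss = cross_entropy(one_hot(hard, len(probs)), probs)
soft_loss = cross_entropy(soft, probs)
```

A hard pseudo-label discards low-confidence utterances and treats the rest as if they were annotated, whereas a soft pseudo-label keeps every utterance and lets the model's uncertainty carry through to the loss.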

Semi-supervised cross-lingual speech emotion recognition / Agarla, Mirko; Bianco, Simone; Celona, Luigi; Napoletano, Paolo; Petrovsky, Alexey; Piccoli, Flavio; Schettini, Raimondo; Shanin, Ivan. - In: EXPERT SYSTEMS WITH APPLICATIONS. - ISSN 0957-4174. - 237, Part A:(2023). [10.1016/j.eswa.2023.121368]


Year of publication: 2023
DOI: https://dx.doi.org/10.1016/j.eswa.2023.121368
Journal title: EXPERT SYSTEMS WITH APPLICATIONS
Publication type: 1.1 Journal article

Files in this item:

File	Size	Format
1-s2.0-S0957417423018705-main.pdf (open access; Type: 2a Post-print, publisher's version / Version of Record; License: Creative Commons)	2.79 MB	Adobe PDF


Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2982303

PORTO @ Archivio Istituzionale della Ricerca
