Scoring heterogeneous speaker vectors using nonlinear transformations and tied PLDa models

Cumani, Sandro; Laface, Pietro

doi:10.1109/TASLP.2018.2806305

Most current state-of-the-art text-independent speaker recognition systems are based on i-vectors, and on probabilistic linear discriminant analysis (PLDA). PLDA assumes that the i-vectors of a trial are homogeneous, i.e., that they have been extracted by the same system. In other words, the enrollment and test i-vectors belong to the same class. However, it is sometimes important to score trials including “heterogeneous” i-vectors, for instance, enrollment i-vectors extracted by an old system, and test i-vectors extracted by a newer, more accurate, system. In this paper, we introduce a PLDA model that is able to score heterogeneous i-vectors independent of their extraction approach, dimensions, and any other characteristics that make a set of i-vectors of the same speaker belong to different classes. The new model, which will be referred to as nonlinear tied-PLDA (NL-Tied-PLDA), is obtained by a generalization of our recently proposed nonlinear PLDA approach, which jointly estimates the PLDA parameters and the parameters of a nonlinear transformation of the i-vectors. The generalization consists of estimating a class-dependent nonlinear transformation of the i-vectors, with the constraint that the transformed i-vectors of the same speaker share the same speaker factor. The resulting model is flexible and accurate, as assessed by the results of a set of experiments performed on the extended core NIST SRE 2012 evaluation. In particular, NL-Tied-PLDA provides better results on heterogeneous trials with respect to the corresponding homogeneous trials scored by the old system, and, in some configurations, it also reaches the accuracy of the new system. Similar results were obtained on the female-extended core NIST SRE 2010 telephone condition.

Scoring heterogeneous speaker vectors using nonlinear transformations and tied PLDa models / Cumani, S., Laface, P.. - In: IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING. - ISSN 2329-9290. - STAMPA. - 26:5(2018), pp. 995-1009. [10.1109/TASLP.2018.2806305]

Scoring heterogeneous speaker vectors using nonlinear transformations and tied PLDa models

Cumani, Sandro;Laface, Pietro

2018

Abstract

Scheda breve

Scheda completa

Scheda completa (DC)

	Anno del prodotto
	
				2018
			
	Codice DOI
	
				https://dx.doi.org/10.1109/TASLP.2018.2806305
			
	Titolo della Rivista
	
				IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING
			
	Appare nelle tipologie
	
				1.1 Articolo in rivista

File in questo prodotto:

File	Dimensione	Formato
18_Heterogeneous-postprint.pdf accesso aperto Descrizione: Articolo principale Tipologia: 2. Post-print / Author's Accepted Manuscript Licenza: Pubblico - Tutti i diritti riservati Dimensione 197.16 kB Formato Adobe PDF Visualizza/Apri	197.16 kB	Adobe PDF	Visualizza/Apri
08293802.pdf accesso riservato Tipologia: 2a Post-print versione editoriale / Version of Record Licenza: Non Pubblico - Accesso privato/ristretto Dimensione 333.83 kB Formato Adobe PDF Visualizza/Apri Richiedi una copia	333.83 kB	Adobe PDF	Visualizza/Apri Richiedi una copia

Pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11583/2707141

Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

PORTO @ Archivio Istituzionale della Ricerca

Scoring heterogeneous speaker vectors using nonlinear transformations and tied PLDa models

Cumani, Sandro;Laface, Pietro

2018

Abstract

Scheda breve Scheda completa Scheda completa (DC)

Pubblicazioni consigliate

Informazioni

Attenzione

Conferma cancellazione

Scheda breve

Scheda completa

Scheda completa (DC)