Joint estimation of PLDA and nonlinear transformations of speaker vectors

Cumani, Sandro; Laface, Pietro

doi:10.1109/TASLP.2017.2724198

The Gaussian probabilistic linear discriminant analysis (PLDA) model assumes Gaussian-distributed priors for the latent variables that represent the speaker and channel factors. Assuming that each training i-vector belongs to a different speaker, as is usually done in i-vector extraction, i-vectors generated by a PLDA model can be considered independent and identically distributed with a Gaussian distribution. We therefore recently proposed transforming the development i-vectors so that their distribution becomes more Gaussian-like. This is achieved by means of a sequence of affine and nonlinear transformations whose parameters are trained by maximum likelihood (ML) estimation on the development set. The evaluation i-vectors are then subjected to the same transformation. Although this i-vector “gaussianization” has been shown to be effective, its underlying assumption is not satisfactory, because i-vectors extracted from segments of the same speaker are not independent. In this work, we show that the model can be improved by properly exploiting the speaker labels, which were ignored in the previous model. In particular, a more effective PLDA model can be obtained by jointly estimating the PLDA parameters and the parameters of the nonlinear transformation of the i-vectors. In other words, while the goal of the previous approach was to “gaussianize” the distribution of the training i-vectors, the objective of this work is to embed the estimation of the nonlinear i-vector transformation in the PLDA model estimation. We will thus refer to this model as the nonlinear PLDA model. We show that this new approach provides a significant gain with respect to PLDA, and a small, yet consistent, improvement with respect to our former i-vector “gaussianization” approach, at no additional cost.
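As a rough illustration of the earlier "gaussianization" idea described in the abstract (not of the joint nonlinear PLDA estimation itself), the sketch below evaluates the ML objective for one affine layer followed by one element-wise invertible nonlinearity, using the change-of-variables formula log p(x) = log N(f(x); 0, I) + log |det J_f(x)|. The abstract does not name the nonlinearity, so the sinh-arcsinh function used here is an assumption, as are all function and parameter names (`affine`, `sinh_arcsinh`, `log_likelihood`, `A`, `b`, `delta`, `eps`). Samples are treated as independent, which is exactly the simplification the paper sets out to remove.

```python
import numpy as np


def affine(x, A, b):
    """Affine layer y = A x + b; its log|det J| is log|det A| for every sample."""
    y = x @ A.T + b
    _, logdet = np.linalg.slogdet(A)
    return y, np.full(x.shape[0], logdet)


def sinh_arcsinh(x, delta, eps):
    """Element-wise y = sinh(delta * arcsinh(x) + eps), with delta > 0.

    Assumed nonlinearity, chosen only because it is invertible and has a
    closed-form derivative: dy/dx = cosh(u) * delta / sqrt(1 + x^2).
    """
    u = delta * np.arcsinh(x) + eps
    y = np.sinh(u)
    log_jac = np.log(np.cosh(u)) + np.log(delta) - 0.5 * np.log1p(x ** 2)
    return y, log_jac.sum(axis=1)


def log_likelihood(x, A, b, delta, eps):
    """Total log-likelihood of the rows of x, treated as i.i.d. samples.

    Each row is mapped through affine -> sinh_arcsinh and scored under a
    standard normal, plus the log-determinants of both layers.
    """
    y, logdet_affine = affine(x, A, b)
    z, logdet_nonlin = sinh_arcsinh(y, delta, eps)
    d = x.shape[1]
    log_gauss = -0.5 * (z ** 2).sum(axis=1) - 0.5 * d * np.log(2.0 * np.pi)
    return float((log_gauss + logdet_affine + logdet_nonlin).sum())
```

In such a setup the parameters would be fitted by maximizing `log_likelihood` on the development vectors with a gradient-based optimizer, and the fitted transformation would then be applied unchanged to the evaluation vectors. With `A` the identity, `b = 0`, `delta = 1` and `eps = 0` the transformation reduces to the identity, so the function returns the plain standard-normal log-likelihood, which is a convenient sanity check.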

Joint estimation of PLDA and nonlinear transformations of speaker vectors / Cumani, Sandro; Laface, Pietro. - In: IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING. - ISSN 2329-9290. - PRINT. - 25:10(2017), pp. 1890-1900. [10.1109/TASLP.2017.2724198]


Year of publication: 2017
DOI: https://dx.doi.org/10.1109/TASLP.2017.2724198
Journal title: IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING
Publication type: 1.1 Journal article

Files in this record:

File	Access	Description	Type	License	Size	Format
FINAL VERSION.PDF	Open access	Main article	2. Post-print / Author's Accepted Manuscript	Public - All rights reserved	2.12 MB	Adobe PDF
07971950.pdf	Not available	-	2a. Post-print publisher's version / Version of Record	Non-public - Private/restricted access	447.38 kB	Adobe PDF


Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2685014


PORTO @ Archivio Istituzionale della Ricerca
