Speaker Recognition Using e–Vectors

Cumani, Sandro; Laface, Pietro

doi:10.1109/TASLP.2018.2791806

Systems based on i–vectors represent the current state-of-the-art in text-independent speaker recognition. Unlike joint factor analysis (JFA), which models the speaker and intersession subspaces separately, the i–vector approach models all the important variability in a single low-dimensional subspace. This paper is based on the observation that JFA estimates a more informative speaker subspace than the “total variability” i–vector subspace, because the latter is obtained by considering each training segment as belonging to a different speaker. We propose a speaker modeling approach that extracts a compact representation of a speech segment, similar to the speaker factors of JFA and to i–vectors, referred to as an “e–vector.” Estimating the e–vector subspace follows a procedure similar to i–vector training, but produces a more accurate speaker subspace, as confirmed by the results of a set of tests performed on the NIST 2012 and 2010 Speaker Recognition Evaluations. Simply replacing the i–vectors with e–vectors yields an approximately 10% average improvement of the C_primary cost function across different systems and classifiers. These performance gains come without any additional memory or computational cost with respect to standard i–vector systems.

Speaker Recognition Using e–Vectors / Cumani, Sandro; Laface, Pietro. - In: IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING. - ISSN 2329-9290. - PRINT. - 26:4(2018), pp. 736-748. [10.1109/TASLP.2018.2791806]


Year of publication: 2018
DOI: https://dx.doi.org/10.1109/TASLP.2018.2791806
Journal title: IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING
Publication type: 1.1 Journal article

Files in this record:

File	Access	Type	License	Size	Format	Description
FINAL VERSION.pdf	Open access	2. Post-print / Author's Accepted Manuscript	Public - All rights reserved	2.23 MB	Adobe PDF	Speaker recognition using e-vectors TASLP 2018
08253498.pdf	Restricted access	2a. Post-print publisher's version / Version of Record	Non-public - Private/restricted access	693.9 kB	Adobe PDF


Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2699467

Warning: the data displayed in this record have not been validated by the university.

PORTO @ Archivio Istituzionale della Ricerca
