Large scale training of Pairwise Support Vector Machines for speaker recognition

Cumani, Sandro; Laface, Pietro

doi:10.1109/TASLP.2014.2341914

State-of-the-art systems for text-independent speaker recognition use as features a compact representation of a speaker utterance, known as an “i-vector”. We recently presented an efficient approach for training a Pairwise Support Vector Machine (PSVM), with a suitable kernel for i-vector pairs, on a fairly large speaker recognition task. Rather than estimating one SVM model per speaker, following the “one versus all” discriminative paradigm, the PSVM approach classifies a trial, consisting of a pair of i-vectors, as belonging or not to the same speaker class. Training a PSVM with a large amount of data, however, is expensive in both memory and computation, because the number of training pairs grows quadratically with the number of training i-vectors. This paper demonstrates that only a very small subset of the training pairs is necessary to train the original PSVM model, and proposes two approaches that discard most of the non-essential training pairs without harming the accuracy of the model. This dramatically reduces the memory and computational resources needed for training, which becomes feasible with large datasets including many speakers. We have assessed these approaches on the extended core conditions of the NIST 2012 Speaker Recognition Evaluation. Our results show that the accuracy of the PSVM trained with a sufficient number of speakers is 10-30% better than that obtained by a PLDA model, depending on the testing conditions. Since PSVM accuracy increases with the training set size, but PSVM training does not scale well to large numbers of speakers, our selection techniques become relevant for training accurate discriminative classifiers.
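The quadratic growth mentioned in the abstract can be illustrated with a small counting sketch. It is not taken from the paper: the function name `count_pairs`, the choice of counting unordered pairs, and the figure of 10 i-vectors per speaker are all assumptions made for illustration.

```python
from math import comb


def count_pairs(ivectors_per_speaker):
    """Count unordered training pairs for a set of speakers.

    `ivectors_per_speaker` is a list with the number of i-vectors
    available for each speaker (a hypothetical input format).
    Returns (same_speaker_pairs, different_speaker_pairs).
    """
    n = sum(ivectors_per_speaker)
    total = comb(n, 2)  # all unordered pairs of i-vectors
    same = sum(comb(k, 2) for k in ivectors_per_speaker)
    return same, total - same


# Assumed setting: every speaker contributes 10 i-vectors.
for n_speakers in (10, 100, 1000):
    same, different = count_pairs([10] * n_speakers)
    print(f"{n_speakers:>5} speakers: {same:>8} same-speaker pairs, "
          f"{different:>10} different-speaker pairs")
```

Under these assumptions, multiplying the number of speakers by 10 multiplies the same-speaker pairs by 10 but the different-speaker pairs by roughly 100, which is why the full pair set quickly becomes too large to store and process.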

Large scale training of Pairwise Support Vector Machines for speaker recognition / Cumani, Sandro; Laface, Pietro. - In: IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING. - ISSN 2329-9290. - PRINT. - 22:11(2014), pp. 1590-1600. [10.1109/TASLP.2014.2341914]


Year of publication: 2014
DOI: https://dx.doi.org/10.1109/TASLP.2014.2341914
Journal: IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING
Publication type: 1.1 Journal article

Files in this record:

File	Access	Type	License	Size	Format
manuscript-R1-open .pdf	Open access	1. Preprint / submitted version [pre-review]	Public - All rights reserved	954.5 kB	Adobe PDF


Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2555345


PORTO @ Archivio Istituzionale della Ricerca
