This work presents a new and efficient approach to discriminative speaker verification in the i-vector space. We illustrate the development of a linear discriminative classifier that is trained to discriminate between two hypotheses: that the pair of feature vectors in a trial belongs to the same speaker, or that it belongs to different speakers. This approach is an alternative to the usual discriminative setup, which discriminates between one speaker and all the other speakers. We use a discriminative classifier based on a Support Vector Machine (SVM) that is trained to estimate the parameters of a symmetric quadratic function approximating a log-likelihood ratio score, without explicitly modeling the i-vector distributions as the generative Probabilistic Linear Discriminant Analysis (PLDA) models do. Training these models is feasible because it is not necessary to expand the i-vector pairs, which would be expensive or even infeasible, even for medium-sized training sets. The results of experiments performed on the tel-tel extended core condition of the NIST 2010 Speaker Recognition Evaluation are competitive with those obtained by generative models, in terms of normalized Detection Cost Function and Equal Error Rate. Moreover, we show that it is possible to train a gender-independent discriminative model that achieves state-of-the-art accuracy, comparable to that of a gender-dependent system, saving memory and execution time in both training and testing.
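As a rough illustration of the idea in the abstract, the sketch below scores i-vector pairs with a symmetric quadratic function and checks that the same score is a linear function of an explicitly expanded pair vector. The parametrization (matrices `Lam` and `Gam`, vector `c`, scalar `k`) and all function names are assumptions made for this example; the abstract only states that the function is symmetric and quadratic. The SVM training itself is not shown.

```python
import numpy as np

def pair_score(x1, x2, Lam, Gam, c, k):
    """Symmetric quadratic score for one i-vector pair (assumed parametrization)."""
    cross = x1 @ Lam @ x2 + x2 @ Lam @ x1      # terms coupling the two i-vectors
    within = x1 @ Gam @ x1 + x2 @ Gam @ x2     # terms depending on each i-vector alone
    return cross + within + c @ (x1 + x2) + k

def expand_pair(x1, x2):
    """Explicit pair expansion: the score above is linear in this vector."""
    cross = np.outer(x1, x2) + np.outer(x2, x1)
    within = np.outer(x1, x1) + np.outer(x2, x2)
    return np.concatenate([cross.ravel(), within.ravel(), x1 + x2, [1.0]])

def all_pair_scores(X, Lam, Gam, c, k):
    """Scores for every pair of rows of X, without building expanded vectors."""
    cross = X @ (Lam + Lam.T) @ X.T                  # x_i' Lam x_j + x_j' Lam x_i
    within = np.einsum("id,de,ie->i", X, Gam, X)     # x_i' Gam x_i for each row
    lin = X @ c                                      # c' x_i for each row
    return (cross
            + within[:, None] + within[None, :]
            + lin[:, None] + lin[None, :]
            + k)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, n = 4, 6                                      # toy sizes, not those of the paper
    X = rng.standard_normal((n, d))
    Lam = rng.standard_normal((d, d))
    Gam = rng.standard_normal((d, d))
    c = rng.standard_normal(d)
    k = 0.3

    # Linear-classifier view: one weight vector applied to the expanded pair.
    w = np.concatenate([Lam.ravel(), Gam.ravel(), c, [k]])
    s_direct = pair_score(X[0], X[1], Lam, Gam, c, k)
    s_linear = w @ expand_pair(X[0], X[1])
    print(np.isclose(s_direct, s_linear))

    # All-pairs scores from matrix products only; the result is symmetric.
    S = all_pair_scores(X, Lam, Gam, c, k)
    print(np.allclose(S, S.T))
```

In this sketch the expanded vector has 2d² + d + 1 components for d-dimensional i-vectors, and the number of pairs grows quadratically with the number of training i-vectors, which is why materializing every expanded pair quickly becomes impractical, while `all_pair_scores` needs only a few matrix products over the original i-vectors.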

Pairwise Discriminative Speaker Verification in the I-Vector Space / Cumani, Sandro; Brummer, N.; Burget, L.; Laface, Pietro; Plchot, O.; Vasilakakis, Vasileios. - In: IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING. - ISSN 1558-7916. - STAMPA. - 21:6(2013), pp. 1217-1227. [10.1109/TASL.2013.2245655]

Pairwise Discriminative Speaker Verification in the I-Vector Space

CUMANI, SANDRO;LAFACE, Pietro;VASILAKAKIS, VASILEIOS
2013

Files in this item:
File: manuscript-R2-open.pdf (open access)
Type: 1. Preprint / submitted version [pre-review]
License: Public - All rights reserved
Size: 997.95 kB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2506145