GENDER INDEPENDENT DISCRIMINATIVE SPEAKER RECOGNITION IN I–VECTOR SPACE / Cumani, Sandro; Glembek, O.; Brümmer, N.; De Villiers, E.; Laface, Pietro. - Print. - (2012), pp. 4361-4364. (Paper presented at the International Conference on Acoustics, Speech and Signal Processing, held in Kyoto, 24-30 March 2012).

GENDER INDEPENDENT DISCRIMINATIVE SPEAKER RECOGNITION IN I–VECTOR SPACE

Cumani, Sandro; Laface, Pietro
2012

Abstract

Speaker recognition systems attain their best accuracy when trained with gender-dependent features and tested on trials of known gender. In real applications, however, gender labels are often unavailable. In this work we illustrate the design of a system that uses gender labels neither in training nor in testing, i.e. a completely Gender Independent (GI) system. It relies on discriminative training, where the trials are i-vector pairs and the discrimination is between the hypothesis that the two feature vectors in a trial belong to the same speaker and the hypothesis that they belong to different speakers. We demonstrate that this pairwise discriminative training can be interpreted as a procedure that estimates the parameters of the best second-order approximation of the log-likelihood ratio score function, and that a pairwise SVM can be used for training a gender-independent system. Our results show that a pairwise GI SVM achieves state-of-the-art performance on the most recent NIST evaluations, comparable to that of a Gender Dependent (GD) system, while saving memory and execution time.
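
As a rough illustration of why a second-order score function over i-vector pairs can be trained with a linear SVM, the sketch below expands a pair into features such that a symmetric quadratic score becomes a plain dot product with a weight vector. The parametrization (the names `Lam`, `Gam`, `c`, `k`), the function names, and the toy dimension are assumptions made for this example; the abstract does not spell out the exact form used by the authors.

```python
import numpy as np

def pairwise_features(x1, x2):
    """Expand an i-vector pair into features of a symmetric second-order score.

    Hypothetical helper: the feature layout is an assumption for illustration.
    """
    cross = np.outer(x1, x2) + np.outer(x2, x1)   # terms mixing the two vectors
    same = np.outer(x1, x1) + np.outer(x2, x2)    # terms within each vector
    return np.concatenate([cross.ravel(), same.ravel(), x1 + x2, [1.0]])

def quadratic_score(x1, x2, Lam, Gam, c, k):
    """Symmetric second-order score of a trial (assumed parametrization)."""
    return (x1 @ Lam @ x2 + x2 @ Lam @ x1
            + x1 @ Gam @ x1 + x2 @ Gam @ x2
            + c @ (x1 + x2) + k)

def stack_parameters(Lam, Gam, c, k):
    """Stack the parameters in the same order as pairwise_features."""
    return np.concatenate([Lam.ravel(), Gam.ravel(), c, [k]])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 4                                   # toy dimension, far smaller than real i-vectors
    x1, x2 = rng.normal(size=d), rng.normal(size=d)
    Lam, Gam = rng.normal(size=(d, d)), rng.normal(size=(d, d))
    c, k = rng.normal(size=d), 0.5
    w = stack_parameters(Lam, Gam, c, k)
    # The quadratic score equals a linear function of the expanded features.
    print(np.isclose(w @ pairwise_features(x1, x2),
                     quadratic_score(x1, x2, Lam, Gam, c, k)))
```

Because the score is linear in the stacked parameters, any linear classifier trained on the expanded pair features, with same-speaker and different-speaker labels, fits a score of this shape without needing gender labels.
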
Files in this item:
File: cumani.pdf
Access: open access
Type: 1. Preprint / submitted version [pre-review]
License: PUBLIC - All rights reserved
Size: 52.9 kB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2496065