Tied Normal Variance-Mean Mixtures for Linear Score Calibration / Cumani, Sandro; Laface, Pietro. - Electronic. - (2019), pp. 6121-6125. (Paper presented at the 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), held in Brighton (UK), 12-17 May 2019) [10.1109/ICASSP.2019.8683379].

Tied Normal Variance-Mean Mixtures for Linear Score Calibration

Cumani, Sandro; Laface, Pietro
2019

Abstract

A speaker verification system decides whether two voice segments belong to the same speaker by comparing a score against a threshold. An optimal threshold can be set if the recognition scores are well calibrated, i.e., if they represent Log-Likelihood Ratios. Logistic Regression (LogReg) is a standard approach for score calibration, but training this discriminative model requires labeled scores. Gaussian and non-Gaussian generative calibration models have recently been proposed as an alternative. They not only perform similarly to or better than LogReg, but also allow unsupervised or semi-supervised training. The goal of this work is to extend these models. In particular, we show that normal variance-mean mixture distributions are able to model well-calibrated non-Gaussian distributed scores, provided that the parameters of the target and non-target score distributions are properly tied. As in the Gaussian case, a linear calibration model can then be estimated by computing Maximum Likelihood estimates of the distribution parameters and of the score transformation. All these approaches are compared on a dataset of variable-duration segments obtained by cutting the NIST 2010 evaluation test data.
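To illustrate what "properly tied" means in the Gaussian case, here is a minimal sketch. It is not the authors' implementation. It assumes the standard result that well-calibrated Gaussian scores follow N(mu, 2*mu) for targets and N(-mu, 2*mu) for non-targets, and that raw scores are Gaussian with a shared variance. The function names and the numeric values are hypothetical, and the class parameters are taken as given rather than estimated by Maximum Likelihood.

```python
import math


def gaussian_logpdf(x, mean, var):
    """Log-density of a univariate Gaussian N(mean, var) evaluated at x."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def tied_gaussian_llr(score, mu):
    """LLR of a score under the tied model:
    target ~ N(mu, 2*mu), non-target ~ N(-mu, 2*mu)."""
    var = 2.0 * mu
    return gaussian_logpdf(score, mu, var) - gaussian_logpdf(score, -mu, var)


def linear_calibration_from_gaussians(m_tar, m_non, var):
    """Affine map s = a*r + b that turns raw scores r into LLRs, assuming
    raw target ~ N(m_tar, var) and raw non-target ~ N(m_non, var)."""
    a = (m_tar - m_non) / var
    b = -(m_tar ** 2 - m_non ** 2) / (2.0 * var)
    return a, b


# Hypothetical raw-score statistics, chosen only for illustration.
m_tar, m_non, var = 2.0, -1.0, 1.5
a, b = linear_calibration_from_gaussians(m_tar, m_non, var)

# Class means and variance of the calibrated scores s = a*r + b.
cal_mean_tar = a * m_tar + b
cal_mean_non = a * m_non + b
cal_var = a * a * var

print("a =", a, "b =", b)
print("calibrated target mean:", cal_mean_tar)
print("calibrated non-target mean:", cal_mean_non)
print("calibrated variance:", cal_var)
```

Under these assumptions the calibrated class means are symmetric about zero and the shared variance is twice the target mean, so the LLR of a calibrated score is the score itself. The non-Gaussian models discussed in the abstract impose an analogous tying on the parameters of the normal variance-mean mixtures; that case is not shown here.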
978-1-4799-8131-1
Files in this item:
File: ScoreCal_ICASSP19.pdf (not available)
Description: Main article
Type: 2a Post-print, publisher's version / Version of Record
License: Not public - private/restricted access
Size: 273.69 kB
Format: Adobe PDF
Use this identifier to cite or link to this document: http://hdl.handle.net/11583/2734222