Most current state-of-the-art text-independent speaker recognition systems are based on i-vectors, and on probabilistic linear discriminant analysis (PLDA). PLDA assumes that the i-vectors of a trial are homogeneous, i.e., that they have been extracted by the same system. In other words, the enrollment and test i-vectors belong to the same class. However, it is sometimes important to score trials including “heterogeneous” i-vectors, for instance, enrollment i-vectors extracted by an old system, and test i-vectors extracted by a newer, more accurate, system. In this paper, we introduce a PLDA model that is able to score heterogeneous i-vectors independent of their extraction approach, dimensions, and any other characteristics that make a set of i-vectors of the same speaker belong to different classes. The new model, which will be referred to as nonlinear tied-PLDA (NL-Tied-PLDA), is obtained by a generalization of our recently proposed nonlinear PLDA approach, which jointly estimates the PLDA parameters and the parameters of a nonlinear transformation of the i-vectors. The generalization consists of estimating a class-dependent nonlinear transformation of the i-vectors, with the constraint that the transformed i-vectors of the same speaker share the same speaker factor. The resulting model is flexible and accurate, as assessed by the results of a set of experiments performed on the extended core NIST SRE 2012 evaluation. In particular, NL-Tied-PLDA provides better results on heterogeneous trials with respect to the corresponding homogeneous trials scored by the old system, and, in some configurations, it also reaches the accuracy of the new system. Similar results were obtained on the female-extended core NIST SRE 2010 telephone condition.
Scoring heterogeneous speaker vectors using nonlinear transformations and tied PLDa models / Cumani, Sandro; Laface, Pietro. - In: IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING. - ISSN 2329-9290. - STAMPA. - 26:5(2018), pp. 995-1009. [10.1109/TASLP.2018.2806305]
Scoring heterogeneous speaker vectors using nonlinear transformations and tied PLDa models
Cumani, Sandro;Laface, Pietro
2018
Abstract
Most current state-of-the-art text-independent speaker recognition systems are based on i-vectors, and on probabilistic linear discriminant analysis (PLDA). PLDA assumes that the i-vectors of a trial are homogeneous, i.e., that they have been extracted by the same system. In other words, the enrollment and test i-vectors belong to the same class. However, it is sometimes important to score trials including “heterogeneous” i-vectors, for instance, enrollment i-vectors extracted by an old system, and test i-vectors extracted by a newer, more accurate, system. In this paper, we introduce a PLDA model that is able to score heterogeneous i-vectors independent of their extraction approach, dimensions, and any other characteristics that make a set of i-vectors of the same speaker belong to different classes. The new model, which will be referred to as nonlinear tied-PLDA (NL-Tied-PLDA), is obtained by a generalization of our recently proposed nonlinear PLDA approach, which jointly estimates the PLDA parameters and the parameters of a nonlinear transformation of the i-vectors. The generalization consists of estimating a class-dependent nonlinear transformation of the i-vectors, with the constraint that the transformed i-vectors of the same speaker share the same speaker factor. The resulting model is flexible and accurate, as assessed by the results of a set of experiments performed on the extended core NIST SRE 2012 evaluation. In particular, NL-Tied-PLDA provides better results on heterogeneous trials with respect to the corresponding homogeneous trials scored by the old system, and, in some configurations, it also reaches the accuracy of the new system. Similar results were obtained on the female-extended core NIST SRE 2010 telephone condition.File | Dimensione | Formato | |
---|---|---|---|
18_Heterogeneous-postprint.pdf
accesso aperto
Descrizione: Articolo principale
Tipologia:
2. Post-print / Author's Accepted Manuscript
Licenza:
PUBBLICO - Tutti i diritti riservati
Dimensione
197.16 kB
Formato
Adobe PDF
|
197.16 kB | Adobe PDF | Visualizza/Apri |
08293802.pdf
non disponibili
Tipologia:
2a Post-print versione editoriale / Version of Record
Licenza:
Non Pubblico - Accesso privato/ristretto
Dimensione
333.83 kB
Formato
Adobe PDF
|
333.83 kB | Adobe PDF | Visualizza/Apri Richiedi una copia |
Pubblicazioni consigliate
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.
https://hdl.handle.net/11583/2707141
Attenzione
Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo