The Gaussian probabilistic linear discriminant analysis (PLDA) model assumes Gaussian-distributed priors for the latent variables that represent the speaker and channel factors. Assuming that each training i-vector belongs to a different speaker, as is usually done in i-vector extraction, i-vectors generated by a PLDA model can be considered independent and identically distributed with a Gaussian distribution. On this basis, we recently proposed transforming the development i-vectors so that their distribution becomes more Gaussian-like. This is done by means of a sequence of affine and nonlinear transformations whose parameters are trained by maximum likelihood (ML) estimation on the development set. The evaluation i-vectors are then subjected to the same transformation. Although i-vector “gaussianization” has been shown to be effective, its underlying assumption is not satisfactory, because i-vectors extracted from segments of the same speaker are not independent. In this work, we show that the model can be improved by properly exploiting the speaker labels, which the previous model ignored. In particular, a more effective PLDA model can be obtained by jointly estimating the PLDA parameters and the parameters of the nonlinear transformation of the i-vectors. In other words, while the goal of the previous approach was to “gaussianize” the distribution of the training i-vectors, the objective of this work is to embed the estimation of the nonlinear i-vector transformation in the PLDA model estimation. We thus refer to this model as the nonlinear PLDA model. We show that this new approach provides a significant gain with respect to PLDA, and a small yet consistent improvement with respect to our former i-vector “gaussianization” approach, at no additional cost.
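The independence issue raised in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' model: it assumes a scalar, simplified Gaussian PLDA in which each observation is a shared speaker factor plus independent within-speaker noise, and the function names and variance values are hypothetical choices made for illustration. Under these assumptions, a single observation per speaker is marginally Gaussian, while two observations of the same speaker remain correlated through the shared factor.

```python
import random
import statistics


def sample_speaker_pairs(n_speakers, between_std=1.0, within_std=0.5, seed=0):
    """Draw two scalar toy 'i-vectors' per speaker (hypothetical helper).

    Each observation is x = y + e, where y ~ N(0, between_std^2) is shared
    by both observations of a speaker and e ~ N(0, within_std^2) is drawn
    independently for each observation.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_speakers):
        y = rng.gauss(0.0, between_std)        # speaker factor, shared
        x1 = y + rng.gauss(0.0, within_std)    # first segment
        x2 = y + rng.gauss(0.0, within_std)    # second segment
        pairs.append((x1, x2))
    return pairs


def covariance(xs, ys):
    """Unbiased sample covariance of two equally long sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    return sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / (len(xs) - 1)


pairs = sample_speaker_pairs(20000)
first = [p[0] for p in pairs]
second = [p[1] for p in pairs]

# Marginal variance: between-speaker plus within-speaker variance.
marginal_var = statistics.variance(first)

# Same-speaker covariance: the between-speaker variance alone, so the two
# observations of one speaker are clearly not independent.
same_speaker_cov = covariance(first, second)
```

In this toy setting the same-speaker covariance is far from zero, which is the kind of dependence that treating every i-vector as belonging to a distinct speaker leaves out.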
Joint estimation of PLDA and nonlinear transformations of speaker vectors / Cumani, Sandro; Laface, Pietro. - In: IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING. - ISSN 2329-9290. - Print. - 25:10(2017), pp. 1890-1900.
|Title:||Joint estimation of PLDA and nonlinear transformations of speaker vectors|
|Publication date:||2017|
|Digital Object Identifier (DOI):||http://dx.doi.org/10.1109/TASLP.2017.2724198|
|Appears in categories:||1.1 Journal article|