Memory-aware i-vector extraction by means of subspace factorization / Cumani, Sandro; Laface, Pietro. - PRINT. - (2015), pp. 4669-4673. (Paper presented at the 40th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2015, held in Brisbane (Australia) in April 2015).

Memory-aware i-vector extraction by means of subspace factorization

Cumani, Sandro; Laface, Pietro
2015

Abstract

Most state-of-the-art speaker recognition systems use i-vectors, a compact representation of spoken utterances. Since the “standard” i-vector extraction procedure requires large memory structures, we recently presented the Factorized Sub-space Estimation (FSE) approach, an efficient technique that dramatically reduces the memory needs of i-vector extraction and is also fast and accurate compared to other proposed approaches. FSE is based on approximating the matrix T, which represents the speaker variability sub-space, by the product of appropriately designed matrices. In this work, we introduce and evaluate a further approximation of the matrices that contribute most to the memory costs of the FSE approach, showing that it is possible to obtain comparable system accuracy using less than half of the memory required by FSE, which corresponds to a memory reduction of more than 60 times with respect to the standard method of i-vector extraction.
ISBN: 9781467369961
Files in this item:
File: factorized-oc-v7.pdf
Access: open access
Type: 1. Preprint / submitted version [pre-review]
License: Public - All rights reserved
Size: 90.37 kB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2589154