Most of the state–of–the–art speaker recognition systems use i– vectors, a compact representation of spoken utterances. Since the “standard” i–vector extraction procedure requires large memory structures, we recently presented the Factorized Sub-space Estimation (FSE) approach, an efficient technique that dramatically reduces the memory needs for i–vector extraction, and is also fast and accurate compared to other proposed approaches. FSE is based on the approximation of the matrix T, representing the speaker variability sub–space, by means of the product of appropriately designed matrices. In this work, we introduce and evaluate a further approximation of the matrices that most contribute to the memory costs in the FSE approach, showing that it is possible to obtain comparable system accuracy using less than a half of FSE memory, which corresponds to more than 60 times memory reduction with respect to the standard method of i–vector extraction.
Memory-aware i-vector extraction by means of subspace factorization / Cumani, Sandro; Laface, Pietro. - STAMPA. - (2015), pp. 4669-4673. (Intervento presentato al convegno 40th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2015 tenutosi a Brisbane (Australia) nel April 2015).
Memory-aware i-vector extraction by means of subspace factorization
CUMANI, SANDRO;LAFACE, Pietro
2015
Abstract
Most of the state–of–the–art speaker recognition systems use i– vectors, a compact representation of spoken utterances. Since the “standard” i–vector extraction procedure requires large memory structures, we recently presented the Factorized Sub-space Estimation (FSE) approach, an efficient technique that dramatically reduces the memory needs for i–vector extraction, and is also fast and accurate compared to other proposed approaches. FSE is based on the approximation of the matrix T, representing the speaker variability sub–space, by means of the product of appropriately designed matrices. In this work, we introduce and evaluate a further approximation of the matrices that most contribute to the memory costs in the FSE approach, showing that it is possible to obtain comparable system accuracy using less than a half of FSE memory, which corresponds to more than 60 times memory reduction with respect to the standard method of i–vector extraction.File | Dimensione | Formato | |
---|---|---|---|
factorized-oc-v7.pdf
accesso aperto
Tipologia:
1. Preprint / submitted version [pre- review]
Licenza:
PUBBLICO - Tutti i diritti riservati
Dimensione
90.37 kB
Formato
Adobe PDF
|
90.37 kB | Adobe PDF | Visualizza/Apri |
Pubblicazioni consigliate
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.
https://hdl.handle.net/11583/2589154
Attenzione
Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo