Memory and computation trade-offs for efficient i-vector extraction / Cumani, Sandro; Laface, Pietro. - In: IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING. - ISSN 1558-7916. - PRINT. - 21:5(2013), pp. 934-944. [10.1109/TASL.2013.2239291]

Memory and computation trade-offs for efficient i-vector extraction

Cumani, Sandro; Laface, Pietro
2013

Abstract

This work aims to reduce the memory demand of the data structures that are usually pre-computed and stored for fast computation of i-vectors, a compact representation of spoken utterances used by most state-of-the-art speaker recognition systems. We propose two new approaches that allow accurate i-vector extraction while requiring less memory, and we show their relations with the standard computation method introduced for eigenvoices and with the recently proposed fast eigen-decomposition technique. The first approach computes an i-vector in a Variational Bayes (VB) framework by iteratively estimating one sub-block of i-vector elements at a time while keeping all the others fixed. It obtains i-vectors as accurate as those of the standard technique while requiring only 25% of its memory. The second technique is based on the Conjugate Gradient solution of a linear system; it is accurate and uses even less memory, but is slower than the VB approach. We analyze and compare the time and memory resources required by all these solutions, which are suited to different applications, and we show that it is possible to obtain accurate results at almost the same speed as the standard solution while greatly reducing memory demand.
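The abstract names the two techniques without giving equations, so the following is only a minimal sketch of the underlying ideas and not the authors' implementation. It assumes the standard i-vector posterior mean, w = (I + sum_c N_c T_c' T_c)^-1 T' f, with the sub-matrices T_c and the first-order statistics f already whitened by the model covariances. All function names, sizes, and the random data are hypothetical. The Conjugate Gradient routine applies the precision matrix through T alone, and the block-wise routine re-estimates one sub-block of w at a time with the others held fixed (block Gauss-Seidel, which is how this sketch interprets the VB iteration for the posterior mean).

```python
import numpy as np

def make_problem(C=16, F=10, M=12, seed=0):
    """Synthetic data (hypothetical sizes): C components, F features, M i-vector dims."""
    rng = np.random.default_rng(seed)
    T = 0.3 * rng.standard_normal((C, F, M))  # whitened sub-matrices T_c
    N = rng.uniform(1.0, 20.0, size=C)        # zero-order statistics N_c
    f = rng.standard_normal((C, F))           # whitened, centered first-order statistics
    return T, N, f

def rhs(T, f):
    """Right-hand side b = T' f."""
    return np.einsum("cfm,cf->m", T, f)

def apply_precision(T, N, w):
    """Compute (I + sum_c N_c T_c' T_c) w without forming any M x M matrix."""
    Tw = np.einsum("cfm,m->cf", T, w)
    return w + np.einsum("cfm,cf->m", T, N[:, None] * Tw)

def ivector_direct(T, N, f):
    """Reference solution: build the full precision matrix and solve."""
    M = T.shape[2]
    L = np.eye(M) + np.einsum("c,cfm,cfn->mn", N, T, T)
    return np.linalg.solve(L, rhs(T, f))

def ivector_cg(T, N, f, tol=1e-10, max_iter=200):
    """Conjugate Gradient on L w = b, using only matrix-vector products."""
    b = rhs(T, f)
    w = np.zeros_like(b)
    r = b.copy()              # residual for the initial guess w = 0
    p = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        if np.sqrt(rs) < tol:
            break
        Ap = apply_precision(T, N, p)
        alpha = rs / (p @ Ap)
        w += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return w

def ivector_blockwise(T, N, f, block=4, sweeps=200):
    """Update one sub-block of w at a time, keeping the other blocks fixed."""
    M = T.shape[2]
    b = rhs(T, f)
    w = np.zeros(M)
    for _ in range(sweeps):
        for start in range(0, M, block):
            idx = slice(start, min(start + block, M))
            Tb = T[:, :, idx]
            # Small diagonal block of the precision matrix for this sub-block.
            L_bb = np.eye(Tb.shape[2]) + np.einsum("c,cfm,cfn->mn", N, Tb, Tb)
            # Rows of L w that belong to this block, computed through T.
            Tw = np.einsum("cfm,m->cf", T, w)
            Lw_b = w[idx] + np.einsum("cfm,cf->m", Tb, N[:, None] * Tw)
            # Solve for the block with the contribution of the other blocks fixed.
            w[idx] = np.linalg.solve(L_bb, b[idx] - Lw_b + L_bb @ w[idx])
    return w

if __name__ == "__main__":
    T, N, f = make_problem()
    w_ref = ivector_direct(T, N, f)
    print("CG max abs error:   ", np.max(np.abs(ivector_cg(T, N, f) - w_ref)))
    print("block max abs error:", np.max(np.abs(ivector_blockwise(T, N, f) - w_ref)))
```

In this sketch, both iterative routines only ever touch T and small per-block matrices, which is the kind of trade of extra computation for lower memory that the abstract describes. The sizes here are toy values chosen so the example runs instantly.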
Files in this item:
File: Memory and computation trade-offs for efficient i-vector extraction.pdf
Access: open access
Description: Main article
Type: 2. Post-print / Author's Accepted Manuscript
License: PUBLIC - All rights reserved
Size: 205.34 kB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2505675