In this work we present a novel generative approach to the score-level fusion of speaker verification systems. The proposed method uses a copula-based representation of the joint score distribution of multiple speaker recognizers, which decouples the dependency structure from the marginal score densities of the individual systems. This lets us combine complex Variance-Gamma marginals with a simple Gaussian copula to model the joint distributions of target and non-target scores, which can then be used for the score-level combination of multiple recognizers. Our results on the NIST SRE 2019 and SITW datasets show that the approach is competitive with state-of-the-art discriminative score fusion techniques, providing both accurate and well-calibrated scores, with a measured relative Cllr reduction of up to 7% over discriminative linear fusion methods.
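
The decoupling described above can be illustrated with a minimal sketch for two recognizers. By Sklar's theorem, a joint density factors into a copula density, evaluated at the marginal CDFs, times the product of the marginal densities; the fused score is then the log-likelihood ratio between the target and non-target joint densities. This is not the paper's implementation: logistic marginals stand in for the Variance-Gamma marginals, and every function name and parameter value below is a hypothetical choice made for illustration only.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for the copula transform


def logistic_logpdf(x, loc, scale):
    """Log-density of a logistic marginal (stand-in for Variance-Gamma)."""
    z = (x - loc) / scale
    return -z - math.log(scale) - 2.0 * math.log1p(math.exp(-z))


def logistic_cdf(x, loc, scale):
    """CDF of the logistic marginal, clipped away from 0 and 1."""
    u = 1.0 / (1.0 + math.exp(-(x - loc) / scale))
    return min(max(u, 1e-12), 1.0 - 1e-12)


def gaussian_copula_logpdf(u1, u2, rho):
    """Log-density of a bivariate Gaussian copula with correlation rho."""
    z1 = _STD_NORMAL.inv_cdf(u1)
    z2 = _STD_NORMAL.inv_cdf(u2)
    det = 1.0 - rho * rho
    quad = (rho * rho * (z1 * z1 + z2 * z2) - 2.0 * rho * z1 * z2) / (2.0 * det)
    return -0.5 * math.log(det) - quad


def class_logpdf(scores, marginals, rho):
    """Joint log-density: copula term plus the sum of marginal log-densities."""
    u = [logistic_cdf(s, loc, scale) for s, (loc, scale) in zip(scores, marginals)]
    log_marg = sum(
        logistic_logpdf(s, loc, scale) for s, (loc, scale) in zip(scores, marginals)
    )
    return gaussian_copula_logpdf(u[0], u[1], rho) + log_marg


def fused_llr(scores, target, nontarget):
    """Fused log-likelihood ratio for one trial scored by two systems."""
    return class_logpdf(scores, *target) - class_logpdf(scores, *nontarget)


# Hypothetical class models: ([(loc, scale) per system], copula correlation).
TARGET = ([(2.0, 1.0), (3.0, 1.5)], 0.6)
NONTARGET = ([(-2.0, 1.0), (-3.0, 1.5)], 0.4)

if __name__ == "__main__":
    print(fused_llr((3.0, 4.0), TARGET, NONTARGET))
```

In this sketch the marginal parameters and the copula correlation are independent inputs, so the marginal family can be changed without touching the dependency model. In practice they would be estimated from held-out target and non-target scores.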

A copula-based generative score-level fusion model for speaker verification / Cumani, S. - (2025), pp. 3723-3727. (Interspeech 2025, Rotterdam (The Netherlands), 17-21 August 2025) [10.21437/Interspeech.2025-147].

Files in this item:
File: cumani25b_interspeech.pdf
Access: open access
Type: 2a Post-print publisher's version / Version of Record
License: Public - All rights reserved
Size: 293.84 kB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/3007719