Analysis of ABC Frontend Audio Systems for the NIST-SRE24

Barahona, S.; Silnova, A.; Mosner, L.; Peng, J.; Plchot, O.; Rohdin, J.; Zhang, L.; Han, J.; Palka, P.; Landini, F.; Burget, L.; Stafylakis, T.; Cumani, S.; Bobos, D.; Hlavacek, M.; Kodovsky, M.; Pavlicek, T.

doi:10.21437/Interspeech.2025-2737

We present a comprehensive analysis of the embedding extractors (frontends) developed by the ABC team for the audio track of NIST SRE 2024. We follow the two scenarios imposed by NIST: using only a provided set of telephone recordings for training (fixed condition) or adding publicly available data (open condition). Under these constraints, we develop the best possible speaker embedding extractors for the predominant conversational telephone speech (CTS) domain. We explored architectures based on ResNet with different pooling mechanisms, the recently introduced ReDimNet architecture, and a system based on the XLS-R model, which represents the family of large pre-trained self-supervised models. In the open condition, we train on the VoxBlink2 dataset, which contains 110 thousand speakers across multiple languages. We observed good performance and robustness from VoxBlink-trained models, and our experiments show practical recipes for developing state-of-the-art frontends for speaker recognition.

Analysis of ABC Frontend Audio Systems for the NIST-SRE24 / Barahona, S.; Silnova, A.; Mosner, L.; Peng, J.; Plchot, O.; Rohdin, J.; Zhang, L.; Han, J.; Palka, P.; Landini, F.; Burget, L.; Stafylakis, T.; Cumani, S.; Bobos, D.; Hlavacek, M.; Kodovsky, M.; Pavlicek, T. - (2025), pp. 5763-5767. (Interspeech 2025, Rotterdam (NL), 17-21 August 2025) [10.21437/Interspeech.2025-2737].


Appears in types

4.1 Conference proceedings contribution

Files in this item:

File	Size	Format
barahona25_interspeech.pdf (open access; type: 2a Post-print, publisher's version / Version of Record; license: Public - All rights reserved)	244.14 kB	Adobe PDF


Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/3007721

PORTO @ Archivio Istituzionale della Ricerca
