MALTO at SemEval-2025 task 3: Detecting hallucinations in LLMs via uncertainty quantification and larger model validation

Savelli, Claudio; Koudounas, Alkis; Giobergia, Flavio

Large language models (LLMs) often produce hallucinations: factually incorrect statements that appear highly persuasive. These errors pose risks in fields like healthcare, law, and journalism. This paper presents our approach to the Mu-SHROOM shared task at SemEval 2025, which challenges researchers to detect hallucination spans in LLM outputs. We introduce a new method that combines probability-based analysis with Natural Language Inference to evaluate hallucinations at the word level. Our technique aims to better align with human judgments while working independently of the underlying model. Our experimental results demonstrate the effectiveness of this method compared to existing baselines.

MALTO at SemEval-2025 task 3: Detecting hallucinations in LLMs via uncertainty quantification and larger model validation / Savelli, Claudio; Koudounas, Alkis; Giobergia, Flavio. - (2025), pp. 1318-1324. (Paper presented at the 19th International Workshop on Semantic Evaluation (SemEval-2025), held in Vienna (AT), July 31 - August 1, 2025).


Year of publication: 2025
ISBN: 979-8-89176-273-2
Publication type: 4.1 Contribution in conference proceedings

Files in this record:

File	Size	Format
2025.semeval-1.175.pdf (open access; type: 2a Post-print, publisher's version / Version of Record; license: Creative Commons)	230.24 kB	Adobe PDF


Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/3002891

PORTO @ Archivio Istituzionale della Ricerca
