MALTO at SemEval-2025 Task 4: Dual Teachers for Unlearning Sensitive Content in LLMs

Savelli, Claudio; Munis, Evren; Bayat, Erfan; Grieco, Andrea; Giobergia, Flavio

Large language models (LLMs) may retain and reproduce sensitive information learned during training, posing significant privacy and ethical concerns. Once detected, such personal information should be deleted from the model. A naive solution is to retrain the model from scratch whenever needed. However, this is infeasible given the immense computational, economic, and environmental costs of training these models. For this reason, Machine Unlearning (MU) has emerged in recent years as a field of research aimed at efficiently deleting specific information from a model’s knowledge. This paper presents our solution to the “Unlearning sensitive content from Large Language Models” shared task at SemEval-2025, which challenges researchers to develop effective MU techniques for LLMs. We adopt a Dual-Teacher framework that leverages a Competent and an Incompetent Teacher to erase unwanted information while selectively preserving model utility. Our approach adapts established computer vision unlearning methods to the sequential nature of language models by minimizing the KL divergence over next-token prediction probabilities. Our experimental results show that our method outperforms state-of-the-art techniques.
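
As a rough illustration of the idea described in the abstract, the following Python sketch computes a per-token KL divergence between a teacher's and the student's next-token distributions, selecting the Incompetent Teacher for forget samples and the Competent Teacher for retain samples. It is not the authors' implementation: the function names, the KL direction (teacher relative to student), the unweighted mean over token positions, and the use of a uniform distribution as the Incompetent Teacher are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Convert one vector of logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def dual_teacher_loss(student_logits, competent_logits, incompetent_logits, is_forget):
    """Mean per-token KL between the selected teacher and the student.

    Each *_logits argument is a list with one logit vector per token position.
    Assumption: forget samples follow the Incompetent Teacher, while retain
    samples follow the Competent Teacher.
    """
    teacher_logits = incompetent_logits if is_forget else competent_logits
    per_token = [
        kl_divergence(softmax(t), softmax(s))
        for t, s in zip(teacher_logits, student_logits)
    ]
    return sum(per_token) / len(per_token)


# Toy example: two token positions, vocabulary of three tokens.
student = [[2.0, 0.5, -1.0], [0.1, 3.0, 0.2]]
competent = [[2.0, 0.5, -1.0], [0.1, 3.0, 0.2]]    # stand-in for the original model
incompetent = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]  # assumed uniform (uninformed) teacher

retain_loss = dual_teacher_loss(student, competent, incompetent, is_forget=False)
forget_loss = dual_teacher_loss(student, competent, incompetent, is_forget=True)
print(retain_loss, forget_loss)
```

In this toy setup the student starts as a copy of the Competent Teacher, so the retain term is zero while the forget term is positive. Minimizing the forget term would push the student's next-token predictions on forget data toward the uninformed teacher.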

MALTO at SemEval-2025 Task 4: Dual Teachers for Unlearning Sensitive Content in LLMs / Savelli, Claudio; Munis, Evren; Bayat, Erfan; Grieco, Andrea; Giobergia, Flavio. - (2025), pp. 1747-1752. (Paper presented at the 19th International Workshop on Semantic Evaluation (SemEval-2025), held in Vienna (AT), July 31 - August 1, 2025).

Year of publication: 2025
ISBN: 979-8-89176-273-2
Publication type: 4.1 Contribution in conference proceedings

Files in this record:

File	Size	Format
2025.semeval-1.229.pdf (open access; type: 2a Post-print, publisher's version / Version of Record; license: Creative Commons)	212.53 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/3002890

PORTO @ Archivio Istituzionale della Ricerca
