
Revealing MonoT5's Learning Mechanisms via Prompt-Token Adaptation / Braga, Marco; Macavaney, Sean; Macdonald, Craig; Pasi, Gabriella. - 16483:(2026), pp. 450-465. (48th European Conference on Information Retrieval (ECIR 2026), Delft (NL), March 29 – April 2, 2026) [10.1007/978-3-032-21289-4_29].

Revealing MonoT5's Learning Mechanisms via Prompt-Token Adaptation

Braga, Marco; Macavaney, Sean; Macdonald, Craig; Pasi, Gabriella
2026

Abstract

Transformer-based cross-encoders, such as MonoT5, achieve state-of-the-art performance in several retrieval tasks. In particular, MonoT5 is based on a sequence-to-sequence architecture and trained on MS MARCO for passage re-ranking, where it predicts the relevance of a passage given an input query. In this paper, to understand what MonoT5 has learned, we analyse the parameter updates during its training process. We observe that the largest shifts occur in a small set of parameters, i.e. less than 1% of the model, while the rest of the model remains relatively unchanged. Motivated by this finding, we propose Light-MonoT5, a parameter-efficient variant of MonoT5 that updates only this small set of parameters during training, and leaves the rest of the network unchanged. Extensive evaluation on both in- and out-domain benchmarks shows that Light-MonoT5 achieves statistically equivalent effectiveness compared to MonoT5. Since relevance can be captured by updating only a subset of T5 parameters, we hypothesise that MonoT5, which updates all the original model’s parameters, primarily learns to evaluate passage quality rather than explicitly assessing the relevance of a passage to the query. To test our hypothesis, we employ QT5, a T5-based quality estimation model, to prune low-quality passages before indexing. On the pruned collection, Light-MonoT5 achieves performance on par with MonoT5, indicating that MonoT5’s strong performance is largely attributable to quality assessment, with minimal adaptation required once low-quality content is removed.
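The abstract's central observation — that fine-tuning shifts fewer than 1% of MonoT5's parameters, so training only that subset suffices — can be illustrated with a minimal numpy sketch. This is not code from the paper: the checkpoint dictionaries, the relative-L2 shift metric, and the 1% budget are illustrative assumptions standing in for the authors' actual analysis.

```python
import numpy as np

def parameter_shift(pretrained: dict, finetuned: dict) -> dict:
    """Relative L2 shift of each parameter tensor between two checkpoints."""
    return {
        name: float(np.linalg.norm(finetuned[name] - pretrained[name])
                    / (np.linalg.norm(pretrained[name]) + 1e-12))
        for name in pretrained
    }

def select_trainable(shifts: dict, budget: float = 0.01) -> set:
    """Pick the most-shifted tensors, keeping roughly `budget` of the model."""
    ranked = sorted(shifts, key=shifts.get, reverse=True)
    k = max(1, int(len(ranked) * budget))
    return set(ranked[:k])

# Toy "model": 200 tensors; fine-tuning moves w0 a lot, the rest barely.
rng = np.random.default_rng(0)
pre = {f"w{i}": rng.normal(size=(4, 4)) for i in range(200)}
post = {n: w + (0.5 if n == "w0" else 0.001) * rng.normal(size=w.shape)
        for n, w in pre.items()}

# With a 1% budget, only the heavily-shifted tensors would stay trainable;
# everything else would be frozen, as in the Light-MonoT5 setup described above.
trainable = select_trainable(parameter_shift(pre, post))
print(sorted(trainable))
```

In a real PyTorch implementation the same selection would be applied by setting `requires_grad = False` on every parameter outside the chosen subset before fine-tuning.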
ISBN: 9783032212887; 9783032212894
Files in this record:
File: 978-3-032-21289-4_29.pdf (restricted access)
Description: Published article
Type: 2a Post-print editorial version / Version of Record
Licence: Not public - Private/restricted access
Size: 612.3 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/3009793