Visual-Word Sense Disambiguation (V-WSD) entails resolving the linguistic ambiguity in a text by selecting a clarifying image from a set of (potentially misleading) candidates. In this paper, we address V-WSD using a state-of-the-art Image-Text Retrieval system, namely CLIP. We propose to alleviate the linguistic ambiguity across multiple domains and languages via text and image augmentation. To augment the textual content, we rely on back-translation with the aid of a variety of auxiliary languages. Fine-tuning CLIP on the full phrases is effective in accurately disambiguating words, and incorporating back-translation enhances the system’s robustness and performance on the test samples written in Indo-European languages.
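As a rough illustration of the retrieval step the abstract describes, the sketch below ranks candidate images by cosine similarity to a phrase and its back-translated variants. It is a minimal sketch under stated assumptions, not the authors' implementation: the toy 3-dimensional vectors stand in for CLIP text and image embeddings, the names `rank_candidates`, `text_variants`, and `candidates` are hypothetical, and averaging the scores across variants is an assumed aggregation rule that the abstract does not specify.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_candidates(text_embeddings, image_embeddings):
    """Return image ids sorted by mean similarity to all text variants.

    Assumption: scores are averaged over the original phrase and its
    back-translated variants; the source does not state how they are combined.
    """
    scores = {}
    for image_id, image_vec in image_embeddings.items():
        sims = [cosine(text_vec, image_vec) for text_vec in text_embeddings]
        scores[image_id] = sum(sims) / len(sims)
    return sorted(scores, key=scores.get, reverse=True)


# Toy vectors standing in for CLIP embeddings (hypothetical values).
text_variants = [
    [0.9, 0.1, 0.0],  # original phrase
    [0.8, 0.2, 0.1],  # same phrase after back-translation
]
candidates = {
    "img_a": [1.0, 0.0, 0.0],
    "img_b": [0.0, 1.0, 0.0],
    "img_c": [0.0, 0.0, 1.0],
}

ranking = rank_candidates(text_variants, candidates)
print(ranking)  # best-matching candidate first
```

In a real pipeline the vectors would come from a (possibly fine-tuned) CLIP text and image encoder, and the back-translated variants from a machine translation system; both are outside the scope of this sketch.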
PoliTo at SemEval-2023 Task 1: CLIP-based Visual-Word Sense Disambiguation Based on Back-Translation / Vaiani, Lorenzo; Cagliero, Luca; Garza, Paolo. - Electronic. - (2023), pp. 1447-1453. (Paper presented at SemEval-2023 (Workshop of ACL), held in Toronto (CAN), July 9–14, 2023) [10.18653/v1/2023.semeval-1.199].
PoliTo at SemEval-2023 Task 1: CLIP-based Visual-Word Sense Disambiguation Based on Back-Translation
Lorenzo Vaiani; Luca Cagliero; Paolo Garza
2023
File | Size | Format |
---|---|---|
2023.semeval-1.199.pdf (open access; type: 2a Post-print publisher's version / Version of Record; license: Creative Commons) | 153.47 kB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/2982327