Cross-lingual timeline summarization / Cagliero, Luca; La Quatra, Moreno; Garza, Paolo; Baralis, Elena Maria. - Electronic. - (2021), pp. 45-53. (Paper presented at the 2021 IEEE Fourth International Conference on Artificial Intelligence and Knowledge Engineering (AIKE), held virtually (online), 1-3 December 2021) [10.1109/AIKE52691.2021.00014].

Cross-lingual timeline summarization

Luca Cagliero; Moreno La Quatra; Paolo Garza; Elena Baralis
2021

Abstract

Timeline summarization methods analyze timestamped, topic-specific news article collections to select the key dates representing the event flow and to extract the most relevant per-date content. Existing approaches are all tailored to a single language and therefore cannot combine topic-related content available in different languages. Enriching news timelines with multilingual content is particularly useful for (i) summarizing complex events, whose main facets are covered differently by media sources from different countries, and (ii) generating news timelines in low-resource languages, for which little news content is available in the target language. This paper presents three alternative approaches to cross-lingual timeline summarization. They combine state-of-the-art extractive summarization methods with machine translation steps at different stages of the timeline generation process. The paper also proposes novel ROUGE-based evaluation metrics customized for cross-lingual timeline summarization with a twofold aim: (i) quantifying the ability of the cross-lingual process to enhance available content extraction in the target language and (ii) estimating summarizer effectiveness in conveying additional content from other languages. A new multilingual timeline benchmark dataset has been generated to allow a thorough analysis of the main factors that influence summarization performance.
ISBN: 978-1-6654-3736-3
Files in this item:

IEEE_AIKE_2021.pdf
Access: open access
Description: Post-print version of the manuscript
Type: 2. Post-print / Author's Accepted Manuscript
License: Public - All rights reserved
Size: 418.78 kB
Format: Adobe PDF

Cross-lingual_timeline_summarization.pdf
Access: not available
Type: 2a. Post-print publisher's version / Version of Record
License: Non-public - Private/restricted access
Size: 923.04 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2945352