MWI-sum: A multilingual summarizer based on frequent weighted itemsets

Baralis, Elena Maria; Cagliero, Luca; Fiori, Alessandro; Garza, Paolo

doi:10.1145/2809786

Multidocument summarization addresses the selection of a compact subset of highly informative sentences, i.e., the summary, from a collection of textual documents. To perform sentence selection, two parallel strategies have been proposed: (a) apply general-purpose techniques based on data mining or information retrieval, and/or (b) perform advanced linguistic analysis relying on semantics-based models (e.g., ontologies) to capture the actual sentence meaning. Since there is an increasing need for processing documents written in different languages, the attention of the research community has recently focused on summarizers based on strategy (a). This article presents a novel multilingual summarizer, namely MWI-Sum (Multilingual Weighted Itemset-based Summarizer), that exploits an itemset-based model to summarize collections of documents ranging over the same topic. Unlike previous approaches, it extracts frequent weighted itemsets tailored to the analyzed collection and uses them to drive the sentence selection process. Weighted itemsets represent correlations among multiple highly relevant terms that are neglected by previous approaches. The proposed approach makes minimal use of language-dependent analyses. Thus, it is easily applicable to document collections written in different languages. Experiments performed on benchmark and real-life collections, English-written and not, demonstrate that the proposed approach performs better than state-of-the-art multilingual document summarizers.
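To make the idea of itemset-driven sentence selection concrete, the following is a minimal sketch and not the MWI-Sum algorithm itself, whose details are not given in this record. It assumes that each sentence is a mapping from stemmed terms to relevance weights, that the weight of an itemset in a sentence is the smallest weight among its terms, that weighted support is the average of that value over all sentences, and that sentences are chosen greedily until every frequent itemset is covered. The function names, the brute-force enumeration, the toy weights, and the threshold are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def frequent_weighted_itemsets(transactions, min_wsup, max_size=2):
    """Return {itemset: weighted support} for itemsets of up to max_size terms.

    transactions: one dict per sentence, mapping term -> positive weight.
    Brute-force enumeration for illustration; not an Apriori-style miner.
    """
    totals = defaultdict(float)
    for weights in transactions:
        terms = sorted(weights)
        for size in range(1, max_size + 1):
            for itemset in combinations(terms, size):
                # Assumption: an itemset weighs as much as its weakest term.
                totals[itemset] += min(weights[t] for t in itemset)
    n = len(transactions)
    return {i: w / n for i, w in totals.items() if w / n >= min_wsup}


def select_sentences(transactions, itemsets):
    """Greedily pick sentence indices until all itemsets are covered."""
    uncovered = dict(itemsets)
    chosen = []

    def gain(idx):
        # Total weighted support of uncovered itemsets contained in sentence idx.
        terms = transactions[idx]
        return sum(w for i, w in uncovered.items() if all(t in terms for t in i))

    while uncovered:
        best = max(range(len(transactions)), key=gain)
        if gain(best) == 0:
            break  # No sentence covers what is left.
        chosen.append(best)
        terms = transactions[best]
        uncovered = {i: w for i, w in uncovered.items()
                     if not all(t in terms for t in i)}
    return chosen


# Toy input: three "sentences" with made-up term weights.
transactions = [
    {"summar": 0.9, "itemset": 0.8},
    {"itemset": 0.7, "weight": 0.6},
    {"languag": 0.2},
]
itemsets = frequent_weighted_itemsets(transactions, min_wsup=0.15)
summary = select_sentences(transactions, itemsets)
print(summary)
```

In this toy run the third sentence contributes no frequent weighted itemset, so it is left out of the selection. Only the term weighting would depend on language-specific preprocessing such as stemming; the mining and selection steps operate on plain term sets.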

MWI-sum: A multilingual summarizer based on frequent weighted itemsets / Baralis, E.M., Cagliero, L., Fiori, A., Garza, P. - In: ACM TRANSACTIONS ON INFORMATION SYSTEMS. - ISSN 1046-8188. - 34:1(2015), pp. 1-35. [10.1145/2809786]

Year: 2015
DOI: https://dx.doi.org/10.1145/2809786
Journal: ACM TRANSACTIONS ON INFORMATION SYSTEMS
Publication type: 1.1 Journal article

Files in this record:

File	Size	Format
2809786.pdf (restricted access; type: 2a Post-print, publisher's version / Version of Record; license: not public, private/restricted access)	626.23 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2623311

Warning: the data shown have not been validated by the university.

PORTO @ Archivio Istituzionale della Ricerca
