Benchmarking Abstractive Models for Italian Legal News Summarization / Benedetto, Irene; Cagliero, Luca; Tarasconi, Francesco; Giacalone, Giuseppe; Bernini, Claudia. - ELECTRONIC. - 379 (2023), pp. 311-316. (Paper presented at JURIX2023: 36th International Conference on Legal Knowledge and Information Systems, held in Maastricht (NLD), 18-20 December 2023) [10.3233/faia230980].

Benchmarking Abstractive Models for Italian Legal News Summarization

Benedetto, Irene; Cagliero, Luca
2023

Abstract

Automated text summarization is particularly important in the legal domain because of the length and inherent complexity of the documents involved. The Legal AI community has already begun to address the summarization problem, but most existing approaches focus on documents written in English. So far, limited effort has been devoted to summarizing Italian legal documents, and the existing approaches are extractive: they select portions of the original content without rephrasing them. To bridge this gap, this work aims to generate abstractive summaries of Italian legal news. We propose to condense the original news content into different summary types, namely an abstract, a title, or a subheader. We benchmark several state-of-the-art summarization models on this task, and we also investigate the suitability of augmented models capable of handling long Italian documents. The experimental results, obtained on a proprietary Italian dataset, show that abstractive models generate fairly accurate summaries and that larger context windows are important for generating news abstracts.
ISBN: 9781643684727; 9781643684734
Files in this record:
FAIA-379-FAIA230980.pdf
Access: open access
Type: 2a Post-print, publisher's version / Version of Record
License: Creative Commons
Size: 177.11 kB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2987349