Leveraging large language models for abstractive summarization of Italian legal news / Benedetto, Irene; Cagliero, Luca; Ferro, Michele; Tarasconi, Francesco; Bernini, Claudia; Giacalone, Giuseppe. - In: ARTIFICIAL INTELLIGENCE AND LAW. - ISSN 0924-8463. - (2025). [10.1007/s10506-025-09431-3]

Leveraging large language models for abstractive summarization of Italian legal news

Benedetto, Irene; Cagliero, Luca;
2025

Abstract

Condensing the key message of a long document into an informative summary is particularly helpful to lawyers and legal experts. State-of-the-art approaches to legal document summarization rely on Language Models (LMs) and are mostly trained on English documents. Far less research has been devoted to summarizing legal documents in languages other than English. In this work, we investigate the applicability of Large Language Models (LLMs) to the summarization of Italian legal news documents. We benchmark state-of-the-art abstractive summarization techniques based on language models, both large and smaller, for headline and abstract generation from legal news documents. We run an extensive set of experiments on a proprietary legal dataset, evaluating the resulting summaries with both quantitative metrics and human evaluation. As expected, the latest LLMs outperform classical models such as BART and T5, particularly in terms of the grammaticality and informativeness of the summary content. Fine-tuned LLMs also show a significant performance gain over their zero-shot setting, although the size of the gain varies across areas of law. Importantly, the specialization of the fine-tuned model already reaches a steady state after training on only a few hundred examples.
Files in this item:
File: s10506-025-09431-3.pdf
Access: restricted
Type: 2a Post-print, publisher's version / Version of Record
License: Not public - private/restricted access
Size: 1.08 MB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2997743