Leveraging large language models for abstractive summarization of Italian legal news / Benedetto, Irene; Cagliero, Luca; Ferro, Michele; Tarasconi, Francesco; Bernini, Claudia; Giacalone, Giuseppe. - In: ARTIFICIAL INTELLIGENCE AND LAW. - ISSN 0924-8463. - (2025). [10.1007/s10506-025-09431-3]
Leveraging large language models for abstractive summarization of Italian legal news
Benedetto, Irene; Cagliero, Luca; Ferro, Michele; Tarasconi, Francesco; Bernini, Claudia; Giacalone, Giuseppe
2025
Abstract
Condensing the key message conveyed by a long document into an informative summary is particularly helpful to lawyers and legal experts. State-of-the-art approaches to legal document summarization rely on Language Models (LMs) and are mostly trained on English documents. More limited research efforts have been devoted to summarizing legal documents in languages other than English. In this work, we investigate the applicability of Large Language Models (LLMs) to summarize Italian legal news documents. We benchmark state-of-the-art abstractive summarization techniques, based on both large and smaller Language Models, for headline and abstract generation from legal news documents. We run an extensive set of experiments on a proprietary legal dataset, evaluating the resulting summaries according to both quantitative metrics and human evaluation. As expected, the latest LLMs outperform classical models such as BART and T5, particularly in terms of grammaticality and informativeness of the summary content. Fine-tuned LLMs also show a significant increase in performance over their zero-shot setting, with gains that vary across law areas. Importantly, the level of specialization of the fine-tuned models reaches a steady state after only a few hundred training samples.
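As a rough illustration of the benchmarking setup described in the abstract, the sketch below runs zero-shot abstractive summarization of an Italian legal news text with an off-the-shelf seq2seq Language Model and scores the output automatically. The checkpoint name, the example texts, and the choice of ROUGE as the metric are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch (not the authors' code) of zero-shot abstractive summarization
# of an Italian legal news document, scored against a reference abstract.
# The checkpoint, texts, and ROUGE metric below are illustrative assumptions.
from transformers import pipeline
import evaluate

# Hypothetical inputs: one legal news document and a human-written reference abstract.
document = "Testo integrale di una notizia giuridica italiana da riassumere ..."
reference = "Sintesi di riferimento redatta da un esperto legale ..."

# Any seq2seq summarization checkpoint can be plugged in here; IT5 is an Italian
# counterpart of the BART/T5 baselines mentioned in the abstract.
summarizer = pipeline("summarization", model="it5/it5-base-news-summarization")

candidate = summarizer(document, max_length=128, min_length=32)[0]["summary_text"]

# Quantitative evaluation; the paper pairs automatic metrics with human judgments,
# which a sketch like this cannot reproduce.
rouge = evaluate.load("rouge")
print(rouge.compute(predictions=[candidate], references=[reference]))
```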
| File | Type | License | Size | Format |
|---|---|---|---|---|
| s10506-025-09431-3.pdf (restricted access) | 2a Post-print, publisher's version / Version of Record | Non-public - Private/restricted access | 1.08 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/2997743