Automatic Inference of Taxonomy Relationships Among Legal Documents / Benedetto, Irene; Cagliero, Luca; Tarasconi, Francesco. - ELECTRONIC. - 1652:(2022), pp. 24-33. (Paper presented at ADBIS 2022, the 26th European Conference on Advances in Databases and Information Systems, held in Torino (IT), September 5-8, 2022) [10.1007/978-3-031-15743-1_3].

Automatic Inference of Taxonomy Relationships Among Legal Documents

Benedetto, Irene; Cagliero, Luca
2022

Abstract

Exploring legal documents such as laws, judgments, and contracts is known to be a time-consuming task. To support domain experts in efficiently browsing their contents, legal documents in electronic form are commonly enriched with semantic annotations. They consist of a list of headwords indicating the main topics. Annotations are commonly organized in taxonomies, which comprise both a set of is-a hierarchies, expressing parent/child-sibling relationships, and more arbitrary related-to semantic links. This paper addresses the use of Deep Learning-based Natural Language Processing techniques to automatically extract unknown taxonomy relationships between pairs of legal documents. Exploring the document content is particularly useful for automatically classifying legal document pairs when topic-level relationships are partly out-of-date or missing, which is quite common for related-to links. The experimental results, collected on a real heterogeneous collection of Italian legal documents, show that word-level vector representations of text are particularly effective in leveraging the presence of domain-specific terms for classification and overcome the limitations of contextualized embeddings when there is a lack of annotated data.
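
The two kinds of taxonomy links named in the abstract can be illustrated with a minimal sketch. It assumes a toy taxonomy in which each headword has at most one parent; the `Taxonomy` class, its method names, and the example headwords are hypothetical and are not taken from the paper or its dataset.

```python
from dataclasses import dataclass, field


@dataclass
class Taxonomy:
    """Toy taxonomy: is-a edges (child -> parent) plus symmetric related-to links."""

    parent: dict[str, str] = field(default_factory=dict)
    related: set[frozenset[str]] = field(default_factory=set)

    def add_is_a(self, child: str, parent: str) -> None:
        # Record a hierarchical is-a edge from child to parent.
        self.parent[child] = parent

    def add_related(self, a: str, b: str) -> None:
        # Record an undirected related-to link between two headwords.
        self.related.add(frozenset((a, b)))

    def relationship(self, a: str, b: str) -> str:
        """Return the label linking two headwords, or 'none' if no link is known."""
        if self.parent.get(a) == b or self.parent.get(b) == a:
            return "parent-child"
        if a != b and a in self.parent and self.parent.get(a) == self.parent.get(b):
            return "sibling"
        if frozenset((a, b)) in self.related:
            return "related-to"
        return "none"


# Hypothetical headwords, chosen only for illustration.
tax = Taxonomy()
tax.add_is_a("lease", "contract")
tax.add_is_a("sale", "contract")
tax.add_related("sale", "consumer protection")

for pair in [("lease", "contract"), ("lease", "sale"),
             ("sale", "consumer protection"), ("lease", "consumer protection")]:
    print(pair, "->", tax.relationship(*pair))
```

In this sketch, a pair that returns "none" corresponds to the case the abstract describes as missing or out-of-date: the taxonomy holds no link, so a relationship would have to be inferred from the document content instead.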
ISBN: 978-3-031-15742-4
ISBN: 978-3-031-15743-1
Files in this item:
ADBIS_2022___legal_classification (1).pdf

Open Access from 30/08/2023

Type: 2. Post-print / Author's Accepted Manuscript
License: Public - All rights reserved
Size: 421.24 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2971183