Exploring legal documents such as laws, judgments, and contracts is known to be a time-consuming task. To support domain experts in efficiently browsing their contents, legal documents in electronic form are commonly enriched with semantic annotations. These annotations consist of a list of headwords indicating the main topics. They are commonly organized in taxonomies, which comprise both a set of is-a hierarchies, expressing parent/child-sibling relationships, and more arbitrary related-to semantic links. This paper addresses the use of Deep Learning-based Natural Language Processing techniques to automatically extract unknown taxonomy relationships between pairs of legal documents. Exploring the document content is particularly useful for automatically classifying legal document pairs when topic-level relationships are partly out-of-date or missing, which is quite common for related-to links. The experimental results, collected on a real heterogeneous collection of Italian legal documents, show that word-level vector representations of text are particularly effective in leveraging the presence of domain-specific terms for classification and overcome the limitations of contextualized embeddings when there is a lack of annotated data.
Automatic Inference of Taxonomy Relationships Among Legal Documents / Benedetto, Irene; Cagliero, Luca; Tarasconi, Francesco. - ELECTRONIC. - 1652:(2022), pp. 24-33. (Paper presented at the ADBIS 2022 26th European Conference on Advances in Databases and Information Systems, held in Torino (IT), September 5-8, 2022) [10.1007/978-3-031-15743-1_3].
Automatic Inference of Taxonomy Relationships Among Legal Documents
Benedetto, Irene; Cagliero, Luca
2022
File | Access | Type | License | Size | Format |
---|---|---|---|---|---|
ADBIS_2022___legal_classification (1).pdf | Open Access since 30/08/2023 | 2. Post-print / Author's Accepted Manuscript | Public - All rights reserved | 421.24 kB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/2971183