To cope with the growing volume, complexity, and articulation of legal documents as well as to foster digital justice and digital law, increasing effort is being devoted to legal knowledge extraction and digital transformation processes. In this paper, we present the ASKE (Automated System for Knowledge Extraction) approach to legal knowledge extraction, based on a combination of context-aware embedding models and zero-shot learning techniques into a three-phase extraction cycle, which is executed a number of times (called generations) to progressively extract concepts representative of the different meanings of terminology used in legal documents chunks. A graph-based data structure called ASKE Conceptual Graph is initially populated through a data preparation step, and it is continuously enriched at each ASKE generation with results of document chunk classification, new extracted terminology, and newly derived concepts. A quantitative evaluation of ASKE knowledge extraction and document classification is provided by considering the EurLex dataset. Furthermore, we present the results of applying ASKE to a real case-study of Italian case law decisions with qualitative feedback from legal experts in the framework of an ongoing national research project.
Enforcing legal information extraction through context-aware techniques: The ASKE approach / Castano, S.; Ferrara, A.; Furiosi, E.; Montanelli, S.; Picascia, S.; Riva, D.; Stefanetti, C.. - In: COMPUTER LAW & SECURITY REVIEW. - ISSN 2212-473X. - 52:(2024). [10.1016/j.clsr.2023.105903]
Enforcing legal information extraction through context-aware techniques: The ASKE approach
Riva D.;
2024
Abstract
To cope with the growing volume, complexity, and articulation of legal documents as well as to foster digital justice and digital law, increasing effort is being devoted to legal knowledge extraction and digital transformation processes. In this paper, we present the ASKE (Automated System for Knowledge Extraction) approach to legal knowledge extraction, based on a combination of context-aware embedding models and zero-shot learning techniques into a three-phase extraction cycle, which is executed a number of times (called generations) to progressively extract concepts representative of the different meanings of terminology used in legal documents chunks. A graph-based data structure called ASKE Conceptual Graph is initially populated through a data preparation step, and it is continuously enriched at each ASKE generation with results of document chunk classification, new extracted terminology, and newly derived concepts. A quantitative evaluation of ASKE knowledge extraction and document classification is provided by considering the EurLex dataset. Furthermore, we present the results of applying ASKE to a real case-study of Italian case law decisions with qualitative feedback from legal experts in the framework of an ongoing national research project.File | Dimensione | Formato | |
---|---|---|---|
1-s2.0-S0267364923001139-main.pdf
accesso aperto
Tipologia:
2a Post-print versione editoriale / Version of Record
Licenza:
Creative commons
Dimensione
1.49 MB
Formato
Adobe PDF
|
1.49 MB | Adobe PDF | Visualizza/Apri |
Pubblicazioni consigliate
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.
https://hdl.handle.net/11583/2992899