A Quality Assessment Approach for Evolving Knowledge Bases / Rashid, Mohammad Rifat Ahmmad; Torchiano, Marco; Rizzo, Giuseppe; Mihindukulasooriya, Nandana; Corcho, Oscar. - In: SEMANTIC WEB. - ISSN 2210-4968. - Print. - 10:2(2019), pp. 349-383. [10.3233/SW-180324]

A Quality Assessment Approach for Evolving Knowledge Bases

Mohammad Rifat Ahmmad Rashid; Marco Torchiano; Giuseppe Rizzo
2019

Abstract

Knowledge bases are nowadays essential components for any task that requires automation with some degree of intelligence. Assessing the quality of a Knowledge Base (KB) is a complex task, as it often means measuring the quality of structured information, ontologies and vocabularies, and queryable endpoints. Popular knowledge bases such as DBpedia, YAGO2, and Wikidata have chosen the RDF data model to represent their data because of its capabilities for semantically rich knowledge representation. Despite its advantages, using the RDF data model poses challenges, for example in data quality assessment and validation. In this paper, we present a novel knowledge base quality assessment approach that relies on evolution analysis. The proposed approach uses data profiling on consecutive knowledge base releases to compute quality measures that allow quality issues to be detected. Our quality characteristics are based on KB evolution analysis, and we use high-level change detection for the measurement functions. In particular, we propose four quality characteristics: Persistency, Historical Persistency, Consistency, and Completeness. The Persistency and Historical Persistency measures concern the degree of change and the lifespan of any entity type. The Consistency and Completeness measures identify properties with incomplete information and contradictory facts. The approach has been assessed both quantitatively and qualitatively on a series of releases from two knowledge bases: eleven releases of DBpedia and eight releases of 3cixty. The capability of the Persistency and Consistency characteristics to detect quality issues varies significantly between the two case studies. The Persistency measure gives observational results for evolving KBs and is highly effective for KBs with periodic updates, such as the 3cixty KB. The Completeness characteristic is extremely effective and achieved 95% precision in error detection for both use cases. The measures are based on simple statistical operations, which makes the solution both flexible and scalable.
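
As a rough illustration of what a measure computed over consecutive releases could look like, the sketch below checks whether the entity count of each class drops from one release to the next. The abstract does not state the measurement functions, so the rule used here (flag a class when its count falls below that of the previous release) is an assumption, and the names `persistency`, `flag_classes`, and the sample counts are hypothetical.

```python
from typing import Dict, List

# Hypothetical profiling output: entity count per class for each release,
# listed oldest to newest. The numbers are invented for illustration.
Profile = Dict[str, List[int]]


def persistency(counts: List[int]) -> List[int]:
    """Return 1 for each consecutive release pair whose count did not drop, else 0.

    Assumed rule, not taken from the abstract.
    """
    return [1 if curr >= prev else 0 for prev, curr in zip(counts, counts[1:])]


def flag_classes(profile: Profile) -> Dict[str, List[int]]:
    """Map each class to the release indices whose count fell below the previous one."""
    flagged: Dict[str, List[int]] = {}
    for cls, counts in profile.items():
        drops = [i + 1 for i, ok in enumerate(persistency(counts)) if ok == 0]
        if drops:
            flagged[cls] = drops
    return flagged


profile: Profile = {
    "ex:Event": [100, 120, 90, 130],
    "ex:Place": [50, 55, 60, 61],
}
flags = flag_classes(profile)
print(flags)
```

A flagged class is only a candidate for manual inspection under this assumption: a drop in the count may reflect a legitimate cleanup rather than an error.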
Files in this item:
File: SWJ___A_Quality_Assessment_Approach_for_Evolving_Knowledge_Bases.pdf
Access: open access
Type: 2. Post-print / Author's Accepted Manuscript
License: PUBLIC - All rights reserved
Size: 1.44 MB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2704228