Quality of Bibliometric Databases: Accuracy in Classification of Document Types / Maisano, Domenico Augusto Francesco; Mastrogiacomo, Luca; Ferrara, Lucrezia; Franceschini, Fiorenzo. - Electronic. - (2024), pp. 79-96. (Paper presented at the 6th International Conference on Quality Engineering and Management (ICQEM), held in Girona (Spain), 13-14 June 2024).

Quality of Bibliometric Databases: Accuracy in Classification of Document Types

Domenico Augusto Maisano; Luca Mastrogiacomo; Lucrezia Ferrara; Fiorenzo Franceschini
2024

Abstract

Purpose: Scholarly publications are usually classified into document types (DTs), predefined categories that describe their nature (e.g., research articles, conference proceedings, reviews, short notes, letters, book chapters). This research presents a new semi-automated methodology to assess the accuracy of DT classification in bibliometric databases, such as Scopus and Web of Science (WoS). The methodology can handle a relatively large number of documents (on the order of tens or hundreds of thousands) and adapts to the different DT classes covered by the databases in use, without requiring an a priori correspondence between their DTs.

Methodological approach: The first phase of the proposed methodology is automated and exploits discrepancies between the DT classifications of two competing databases (e.g., Scopus and WoS) to identify a subset of potentially misclassified documents, i.e., documents with possible DT-classification errors. The second phase is a manual analysis of this subset, which identifies and attributes the DT-classification errors. The novel methodology is illustrated through a realistic application example.

Findings: The methodology is shown to be effective in identifying DT-classification errors, suggesting a path to improve the quality and reliability of bibliometric databases. In the application example provided, Scopus and WoS have overall error rates of around 1.7% and 1.2%, respectively. A similar analysis based on a larger sample of documents is still in progress.

Practical/social implications: By improving database accuracy, the academic community can benefit from more reliable bibliometric indicators, which can affect (at least to some extent) research funding, decision making and academic reputation.
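The abstract does not say how the automated first phase detects discrepancies, so the following is only a minimal sketch of one possible approach, not the authors' procedure. It assumes each document is available as a `(doc_id, dt_a, dt_b)` tuple holding the DT assigned by each database, and it flags a document when its DT pair is rare on both sides. The function name `flag_discrepancies` and the threshold `min_share` are hypothetical. No DT correspondence table is supplied in advance: the dominant pairings are inferred from the data.

```python
from collections import Counter

def flag_discrepancies(records, min_share=0.5):
    """Return ids of documents whose DT pair is unusual in both databases.

    records: iterable of (doc_id, dt_a, dt_b) tuples (assumed input layout).
    min_share: hypothetical threshold; a pair is treated as consistent when it
    covers at least this share of the documents with the same dt_a or dt_b.
    """
    records = list(records)
    pair_counts = Counter((a, b) for _, a, b in records)
    a_counts = Counter(a for _, a, _ in records)
    b_counts = Counter(b for _, _, b in records)

    flagged = []
    for doc_id, a, b in records:
        share_in_a = pair_counts[(a, b)] / a_counts[a]
        share_in_b = pair_counts[(a, b)] / b_counts[b]
        # A pair that dominates on either side is read as a stable mapping
        # between the two schemes, so only pairs rare on both sides are flagged.
        if max(share_in_a, share_in_b) < min_share:
            flagged.append(doc_id)
    return flagged

# Toy data for illustration only; not taken from the paper.
records = [
    ("d1", "Article", "Article"),
    ("d2", "Article", "Article"),
    ("d3", "Article", "Article"),
    ("d4", "Article", "Review"),
    ("d5", "Review", "Review"),
    ("d6", "Review", "Review"),
    ("d7", "Conference Paper", "Proceedings Paper"),
    ("d8", "Conference Paper", "Proceedings Paper"),
]
flagged = flag_discrepancies(records)
print(flagged)
```

In this toy data, differently named but consistently paired DTs ("Conference Paper" and "Proceedings Paper") are not flagged, while the single "Article"/"Review" pairing is. Flagged documents are only candidates: in the methodology described above, the manual second phase decides whether an error exists and which database made it.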
ISBN: 978-989-54911-2-4

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2999235