Quality of Bibliometric Databases: Accuracy in Classification of Document Types / Maisano, Domenico Augusto Francesco; Mastrogiacomo, Luca; Ferrara, Lucrezia; Franceschini, Fiorenzo. - Electronic. - (2024), pp. 79-96. (Paper presented at the 6th International Conference on Quality Engineering and Management (ICQEM), held in Girona (Spain), 13-14 June 2024).
Quality of Bibliometric Databases: Accuracy in Classification of Document Types
Domenico Augusto Maisano;Luca Mastrogiacomo;Lucrezia Ferrara;Fiorenzo Franceschini
2024
Abstract
Purpose: Scholarly publications are usually classified into document types (DTs), which are predefined categories describing their nature (e.g., research articles, conference proceedings, reviews, short notes, letters, book chapters, etc.). This research presents a new semi-automated methodology to assess the accuracy of DT classification in bibliometric databases, such as Scopus and Web of Science (WoS). The methodology can handle a relatively large number of documents (on the order of tens to hundreds of thousands) and adapts to the different classes of DTs covered by the databases in use, without requiring an a priori definition of a correspondence between their DTs.

Methodological approach: The first phase of the proposed methodology is automated and exploits discrepancies between the DT classifications of two competing databases (e.g., Scopus and WoS) to identify a subset of potentially misclassified documents, i.e., documents with possible DT-classification errors. The second phase involves the manual analysis of this subset of documents, resulting in the identification of DT-classification errors and their attribution to one database or the other. The novel methodology is illustrated through a realistic application example.

Findings: The methodology is shown to be effective in identifying DT-classification errors, suggesting a path to improve the quality and reliability of bibliometric databases. In the application example provided, Scopus and WoS have overall error rates of around 1.7% and 1.2%, respectively. A similar analysis based on a larger sample of documents is still in progress.

Practical/social implications: By improving database accuracy, the academic community can benefit from more reliable bibliometric indicators, which can affect (at least to some extent) research funding, decision making and academic reputation.
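The two phases described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how discrepancies are detected, so the rule used here is an assumption made only for illustration: the dominant DT pairing between the two databases is inferred from the matched documents themselves (so no a priori correspondence is needed), and a document is flagged when its DT pair is not the dominant pairing in either direction. The function names `flag_discrepancies` and `error_rate`, the record layout, and the toy data are all hypothetical, and the error rate is assumed to be the share of analyzed documents with an error attributed to a given database.

```python
from collections import Counter, defaultdict


def flag_discrepancies(records):
    """Phase 1 (automated), under an assumed rule.

    records: iterable of (doc_id, dt_a, dt_b) for documents indexed in
    both databases, where dt_a and dt_b are the DT labels they assign.
    Returns the ids of documents whose DT pair is not the most frequent
    pairing in either direction.
    """
    records = list(records)
    by_a = defaultdict(Counter)  # DT in database A -> counts of DTs in B
    by_b = defaultdict(Counter)  # DT in database B -> counts of DTs in A
    for _, dt_a, dt_b in records:
        by_a[dt_a][dt_b] += 1
        by_b[dt_b][dt_a] += 1

    # Assumption: the correspondence between DTs is inferred from the data
    # as the most frequent counterpart of each DT.
    dominant_b = {a: c.most_common(1)[0][0] for a, c in by_a.items()}
    dominant_a = {b: c.most_common(1)[0][0] for b, c in by_b.items()}

    return [
        doc_id
        for doc_id, dt_a, dt_b in records
        if dominant_b[dt_a] != dt_b and dominant_a[dt_b] != dt_a
    ]


def error_rate(n_errors, n_documents):
    """Percentage of analyzed documents with an error attributed to a
    database after the manual phase (assumed definition)."""
    return 100.0 * n_errors / n_documents


# Toy data, not taken from the paper.
toy_records = [
    ("d1", "Article", "Article"),
    ("d2", "Article", "Article"),
    ("d3", "Article", "Article"),
    ("d4", "Conference Paper", "Proceedings Paper"),
    ("d5", "Conference Paper", "Proceedings Paper"),
    ("d6", "Article", "Proceedings Paper"),
    ("d7", "Review", "Review"),
    ("d8", "Review", "Review"),
    ("d9", "Article", "Review"),
]
suspects = flag_discrepancies(toy_records)
```

In this toy run, only the flagged documents would go on to manual analysis, where a reviewer decides which database (if any) made the error. The flagging rule is deliberately lenient; a stricter variant could flag any pair that is not dominant in both directions, at the cost of a larger subset to inspect by hand.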
https://hdl.handle.net/11583/2999235