A large‑scale semi‑automated approach for assessing document‑type classification errors in bibliometric databases

Maisano, Domenico Augusto Francesco; Mastrogiacomo, Luca; Ferrara, Lucrezia; Franceschini, Fiorenzo

doi:10.1007/s11192-025-05244-y

The accuracy of bibliometric databases in classifying document types (DTs), such as research articles, conference proceedings, reviews, short notes, letters, and book chapters, is crucial for the academic community, as bibliometric indicators may significantly influence research funding, decision-making, and academic reputation. This study presents a semi-automated methodology to assess the accuracy of DT classification in bibliometric databases, such as Scopus and Web of Science (WoS). The methodology can handle large document volumes and adapt to different DT categories without predefined correspondences. The first phase of the methodology automatically identifies discrepancies in DT classifications between Scopus and WoS, in order to find potentially misclassified documents; the second phase involves manually analyzing these documents to confirm and attribute classification errors. The methodology is applied to a sample of several tens of thousands of papers from the teaching staff of two major universities in Turin (Italy). The results show overall error rates of approximately 2.7% for Scopus and 2.3% for WoS. The paper also analyzes the most common types of errors found in both databases, providing an interpretation of these inaccuracies and some insights for possible improvements in the quality of these databases.
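The first, automated phase can be illustrated with a small sketch. The abstract does not state how discrepancies are detected, so everything below is an assumption made for illustration, not the authors' procedure: documents are taken to be already matched across the two databases, the DT pairings that co-occur most often are treated as the implied correspondences, and a document is flagged for manual review when its pairing is rare from both sides. The function name `flag_discrepancies`, the `min_share` threshold, and the DT label strings are hypothetical.

```python
from collections import Counter


def flag_discrepancies(records, min_share=0.5):
    """Return IDs of documents whose (Scopus DT, WoS DT) pairing is unusual.

    records: iterable of (doc_id, scopus_dt, wos_dt) tuples for documents
    found in both databases (the matching step is assumed, not shown).
    min_share: hypothetical threshold; a pairing counts as consistent if it
    covers at least this share of its Scopus DT or of its WoS DT.
    """
    records = list(records)
    # Count how often each DT pairing, and each DT on its own, occurs.
    pair_counts = Counter((s, w) for _, s, w in records)
    scopus_totals = Counter(s for _, s, _ in records)
    wos_totals = Counter(w for _, _, w in records)

    flagged = []
    for doc_id, s, w in records:
        n = pair_counts[(s, w)]
        share_of_scopus_dt = n / scopus_totals[s]
        share_of_wos_dt = n / wos_totals[w]
        # Flag only when the pairing is a minority from both perspectives,
        # so no predefined DT correspondence table is required.
        if share_of_scopus_dt < min_share and share_of_wos_dt < min_share:
            flagged.append(doc_id)
    return flagged


# Illustrative data only; the labels and counts are invented.
sample = (
    [(f"a{i}", "Article", "Article") for i in range(8)]
    + [("x1", "Article", "Review")]
    + [(f"r{i}", "Review", "Review") for i in range(3)]
    + [(f"c{i}", "Conference Paper", "Proceedings Paper") for i in range(2)]
)
print(flag_discrepancies(sample))  # prints ['x1']
```

In this toy sample, the differently named pair "Conference Paper" / "Proceedings Paper" is not flagged because the two labels always co-occur, whereas the single "Article" / "Review" document is passed on to the manual second phase, where the error would be confirmed and attributed to one database.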

A large‑scale semi‑automated approach for assessing document‑type classification errors in bibliometric databases / Maisano, Domenico Augusto Francesco; Mastrogiacomo, Luca; Ferrara, Lucrezia; Franceschini, Fiorenzo. - In: SCIENTOMETRICS. - ISSN 0138-9130. - Print. - 130:3(2025), pp. 1901-1938. [10.1007/s11192-025-05244-y]

Year of publication: 2025
DOI: https://dx.doi.org/10.1007/s11192-025-05244-y
Journal title: SCIENTOMETRICS

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2999016
