Omitted citations – i.e., missing links between a cited paper and the corresponding citing papers – are the main consequence of several bibliometric-database errors. To reduce these errors, databases may undertake two actions: (i) improving the control of the (new) papers to be indexed, i.e., limiting the introduction of “new” dirty data, and (ii) detecting and correcting errors in the papers already indexed by the database, i.e., cleaning “old” dirty data. The latter action is probably more complicated, as it requires the application of suitable error-detection procedures to a huge amount of data. Based on an extensive sample of scientific papers in the Engineering-Manufacturing field, this study focuses on old dirty data in the Scopus and WoS databases. To this purpose, a recent automated algorithm for estimating the omitted-citation rate of databases is applied to the same sample of papers, but in three different-time sessions. A database’s ability to clean the old dirty data is evaluated considering the variations in the omitted-citation rate from session to session. The major outcomes of this study are that: (i) both databases slowly correct old omitted citations, and (ii) a small portion of initially corrected citations can surprisingly come off from databases over time.
On the correction of “old” omitted citations by bibliometric databases / Franceschini, Fiorenzo; Maisano, DOMENICO AUGUSTO FRANCESCO; Mastrogiacomo, Luca. - ELETTRONICO. - (2015), pp. 1200-1207. (Intervento presentato al convegno 15th ISSI (International Society of Scientometrics and Informetrics Conference) 2015 tenutosi a Istanbul, Turkey nel 29 June - 4 July 2015).
On the correction of “old” omitted citations by bibliometric databases
FRANCESCHINI, FIORENZO;MAISANO, DOMENICO AUGUSTO FRANCESCO;MASTROGIACOMO, LUCA
2015
Abstract
Omitted citations – i.e., missing links between a cited paper and the corresponding citing papers – are the main consequence of several bibliometric-database errors. To reduce these errors, databases may undertake two actions: (i) improving the control of the (new) papers to be indexed, i.e., limiting the introduction of “new” dirty data, and (ii) detecting and correcting errors in the papers already indexed by the database, i.e., cleaning “old” dirty data. The latter action is probably more complicated, as it requires the application of suitable error-detection procedures to a huge amount of data. Based on an extensive sample of scientific papers in the Engineering-Manufacturing field, this study focuses on old dirty data in the Scopus and WoS databases. To this purpose, a recent automated algorithm for estimating the omitted-citation rate of databases is applied to the same sample of papers, but in three different-time sessions. A database’s ability to clean the old dirty data is evaluated considering the variations in the omitted-citation rate from session to session. The major outcomes of this study are that: (i) both databases slowly correct old omitted citations, and (ii) a small portion of initially corrected citations can surprisingly come off from databases over time.File | Dimensione | Formato | |
---|---|---|---|
ATTI_ISSI2015.pdf
accesso aperto
Descrizione: Articolo principale
Tipologia:
2. Post-print / Author's Accepted Manuscript
Licenza:
Pubblico - Tutti i diritti riservati
Dimensione
501.21 kB
Formato
Adobe PDF
|
501.21 kB | Adobe PDF | Visualizza/Apri |
Pubblicazioni consigliate
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.
https://hdl.handle.net/11583/2614476
Attenzione
Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo