Analysis of Machine Learning Based Imputation of Missing Data

Syed Tahir Hussain Rizvia; Muhammad Yasir Latif; Muhammad Saad Amin; Achraf Jabeur Telmoudi; Nasir Ali Shah

doi:10.1080/01969722.2023.2247257

Data analysis and classification can be affected by the presence of missing data in datasets. Missing data are usually handled with either deletion-based or imputation-based methods, which result, respectively, in a reduction of data records or in inaccurate values imputed from means or medians. A significant improvement can be achieved if missing values are imputed more accurately and at lower computational cost. In this work, a flow for the analysis of machine learning-based algorithms for missing data imputation is proposed. The K-nearest neighbors (KNN) and Sequential KNN (SKNN) algorithms are used to impute missing values in datasets. Datasets whose missing values were handled with a statistical deletion approach (List-wise Deletion, LD) and with the ML-based imputation methods (KNN and SKNN) are then tested and compared using different ML classifiers (Support Vector Machine, SVM, and Decision Tree, DT) to evaluate the effectiveness of the imputed data. The algorithms are compared in terms of accuracy, and the results show that the ML-based imputation method SKNN outperforms both the LD-based approach and the KNN method at handling missing data in almost every dataset with both classification algorithms (SVM and DT).
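To make the two baseline strategies named in the abstract concrete, the following is a minimal sketch of list-wise deletion and plain KNN imputation using only the Python standard library. It is not the authors' implementation: the function names (`listwise_delete`, `knn_impute`), the toy data, the use of `None` for a missing value, and the rescaled Euclidean distance over shared features are all assumptions made for illustration. SKNN and the SVM/DT evaluation step are not shown.

```python
import math


def listwise_delete(rows):
    """List-wise deletion: drop every row that has at least one missing value."""
    return [r for r in rows if all(v is not None for v in r)]


def _distance(a, b):
    """Euclidean distance over the features observed in both rows.

    The squared sum is rescaled by (total features / shared features) so rows
    with fewer shared features are not favoured. This choice is an assumption
    of the sketch. Returns infinity when no feature is shared.
    """
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf
    squared = sum((x - y) ** 2 for x, y in shared)
    return math.sqrt(squared * len(a) / len(shared))


def knn_impute(rows, k=2):
    """Fill each missing value with the mean of that feature over the k
    nearest rows in which the feature is observed."""
    imputed = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Donor candidates: other rows with feature j observed.
            candidates = [
                (_distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            ]
            candidates = [c for c in candidates if c[0] != math.inf]
            candidates.sort(key=lambda c: c[0])
            nearest = candidates[:k]
            if nearest:
                imputed[i][j] = sum(v for _, v in nearest) / len(nearest)
    return imputed


# Toy dataset (hypothetical values); None marks a missing entry.
rows = [
    [1.0, 2.0, 3.0],
    [1.1, None, 3.1],
    [0.9, 2.2, 2.9],
    [8.0, 9.0, None],
    [8.2, 9.1, 10.0],
]

complete = listwise_delete(rows)   # keeps only the fully observed rows
filled = knn_impute(rows, k=2)     # keeps all rows, fills the gaps
```

On this toy data, deletion discards two of the five records, while imputation keeps all five. The second missing entry also shows how sensitive plain KNN is to `k` when few close donors exist: with `k=2` a distant row contributes to the average, whereas with `k=1` only the closest row is used.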

Analysis of Machine Learning Based Imputation of Missing Data / Tahir Hussain Rizvia, S., Yasir Latif, M., Saad Amin, M., Jabeur Telmoudi, A., Shah, N.A. - In: CYBERNETICS AND SYSTEMS. - ISSN 1087-6553. - ELECTRONIC. - 56:6(2025), pp. 818-832. [10.1080/01969722.2023.2247257]

Year of publication: 2025
DOI: https://dx.doi.org/10.1080/01969722.2023.2247257
Journal: CYBERNETICS AND SYSTEMS
Publication type: 1.1 Journal article

Files in this record:

File	Access	Type	License	Size	Format
Paper10.pdf	Open access	2. Post-print / Author's Accepted Manuscript	Creative Commons	422.96 kB	Adobe PDF
Analysis of Machine Learning Based Imputation of Missing Data.pdf	Open access	2a. Post-print, publisher's version / Version of Record	Creative Commons	2.16 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2979607

PORTO @ Archivio Istituzionale della Ricerca
