We propose an unsupervised, model-agnostic, wrapper method for feature selection. We assume that if a feature can be predicted using the others, it adds little information to the problem, and therefore could be removed without impairing the performance of whatever model will be eventually built. The proposed method iteratively identifies and removes predictable, or nearly-predictable, redundant features, allowing to trade-off complexity with expected quality. The approach do not rely on target labels nor values, and the model used to identify predictable features is not related to the final use of the feature set. Therefore, it can be used for supervised, unsupervised, or semi-supervised problems, or even as a safe, pre-processing step to improve the quality of the results of other feature selection techniques. Experimental results against state-of-the-art feature-selection algorithms show satisfying performance on several non-trivial benchmarks.

Predictable Features Elimination: An Unsupervised Approach to Feature Selection / Barbiero, Pietro; Squillero, Giovanni; Tonda, Alberto. - STAMPA. - 13163:(2022), pp. 399-412. (Intervento presentato al convegno LOD 2021: Machine Learning, Optimization, and Data Science) [10.1007/978-3-030-95467-3_29].

Predictable Features Elimination: An Unsupervised Approach to Feature Selection

Squillero, Giovanni;
2022

Abstract

We propose an unsupervised, model-agnostic, wrapper method for feature selection. We assume that if a feature can be predicted using the others, it adds little information to the problem, and therefore could be removed without impairing the performance of whatever model will be eventually built. The proposed method iteratively identifies and removes predictable, or nearly-predictable, redundant features, allowing to trade-off complexity with expected quality. The approach do not rely on target labels nor values, and the model used to identify predictable features is not related to the final use of the feature set. Therefore, it can be used for supervised, unsupervised, or semi-supervised problems, or even as a safe, pre-processing step to improve the quality of the results of other feature selection techniques. Experimental results against state-of-the-art feature-selection algorithms show satisfying performance on several non-trivial benchmarks.
2022
978-3-030-95466-6
978-3-030-95467-3
File in questo prodotto:
File Dimensione Formato  
2021_LOD___Predictable_Feature_Elimination__take_3_.pdf

Open Access dal 04/02/2023

Tipologia: 2. Post-print / Author's Accepted Manuscript
Licenza: PUBBLICO - Tutti i diritti riservati
Dimensione 480.29 kB
Formato Adobe PDF
480.29 kB Adobe PDF Visualizza/Apri
2022 lod.pdf

non disponibili

Tipologia: 2a Post-print versione editoriale / Version of Record
Licenza: Non Pubblico - Accesso privato/ristretto
Dimensione 2.53 MB
Formato Adobe PDF
2.53 MB Adobe PDF   Visualizza/Apri   Richiedi una copia
Pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11583/2954788