Learning Confidence Intervals for Feature Importance: A Fast Shapley-based Approach / Napolitano, Davide; Vaiani, Lorenzo; Cagliero, Luca. - ELECTRONIC. - 3379:(2023). (Paper presented at the Data Analytics solutions for Real-LIfe APplications (DARLI-AP) workshop held in Ioannina, Greece, March 28-31, 2023).
Learning Confidence Intervals for Feature Importance: A Fast Shapley-based Approach
Napolitano, Davide; Vaiani, Lorenzo; Cagliero, Luca
2023
Abstract
Inferring feature importance is a well-known machine learning problem. Assigning importance scores to the input data features is particularly helpful for explaining black-box models. Existing approaches rely on either statistical or neural network-based methods. Among them, Shapley Value estimates are among the most widely used scores for explaining individual classification models or ensemble methods. As a drawback, state-of-the-art neural network-based approaches neglect the uncertainty of the input predictions while computing the confidence intervals of the feature importance scores. This paper extends a state-of-the-art neural method for Shapley Value estimation to handle the uncertain predictions made by ensemble methods and to estimate confidence intervals for the feature importances. The results show that (1) the estimated confidence intervals are coherent with expectations and more reliable than those of baseline methods; (2) the efficiency of the Shapley Value estimator is comparable to that of traditional models; (3) the level of uncertainty of the Shapley Value estimates decreases as the ensemble includes larger numbers of predictors.
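The abstract refers to Shapley Value estimation and to confidence intervals over the resulting importance scores. As background, here is a minimal, self-contained sketch of the classical Monte Carlo permutation estimator for Shapley values, together with a naive seed-resampling confidence interval. This is illustrative only and is not the neural estimator proposed in the paper; the function names and the toy payoff games are assumptions of this sketch.

```python
import random
import statistics

def shapley_mc(value_fn, n_features, n_perm=200, seed=0):
    """Monte Carlo permutation estimate of Shapley values.

    value_fn maps a frozenset of feature indices to a real-valued payoff;
    each feature's estimate averages its marginal contribution over
    n_perm random feature orderings.
    """
    rng = random.Random(seed)
    phi = [0.0] * n_features
    for _ in range(n_perm):
        order = list(range(n_features))
        rng.shuffle(order)
        coalition = set()
        v_prev = value_fn(frozenset(coalition))
        for j in order:
            coalition.add(j)
            v_curr = value_fn(frozenset(coalition))
            phi[j] += v_curr - v_prev  # marginal contribution of feature j
            v_prev = v_curr
    return [p / n_perm for p in phi]

# Toy additive game: the payoff is the sum of per-feature weights,
# so the exact Shapley values equal the weights themselves.
weights = [3.0, 1.0, 0.5]
est = shapley_mc(lambda S: sum(weights[i] for i in S), 3)

# Toy interaction game: payoff 1 only when features 0 and 1 co-occur;
# the exact Shapley values are (0.5, 0.5, 0.0).
v_and = lambda S: 1.0 if {0, 1} <= S else 0.0
phi_and = shapley_mc(v_and, 3, n_perm=400, seed=1)

# Naive confidence interval: re-run the estimator under different seeds
# and use the spread of the estimates for feature 0.
ests = [shapley_mc(v_and, 3, n_perm=100, seed=s) for s in range(10)]
ci_halfwidth = 1.96 * statistics.stdev(e[0] for e in ests)
```

As `n_perm` grows, the interval shrinks at the usual Monte Carlo rate; the paper studies the analogous effect of ensemble size on the uncertainty of the neural Shapley estimates.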
| File | Size | Format |
|---|---|---|
| DARLI-AP_2023_5.pdf (open access; type: 2a Post-print editorial version / Version of Record; license: Creative Commons) | 1.67 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/2978548