Explaining black box models by means of local rules / Pastor, E.; Baralis, E. (2019), pp. 510-517. Paper presented at the 34th Annual ACM Symposium on Applied Computing (SAC 2019), Limassol, Cyprus, 2019 [10.1145/3297280.3297328].
Explaining black box models by means of local rules
Pastor, E.; Baralis, E.
2019
Abstract
Many high-performance machine learning methods produce black box models, which do not disclose the internal logic that yields their predictions. However, in many application domains, understanding the motivation behind a prediction is becoming a prerequisite for trusting the prediction itself. We propose a novel rule-based method that explains the prediction of any classifier on a specific instance by analyzing the joint effect of feature subsets on the classifier prediction. The relevant subsets are identified by learning a local rule-based model in the neighborhood of the prediction to be explained. While local rules give qualitative insight into the local behavior, their relevance is quantified using the concept of prediction difference. Preliminary experiments show that, despite the approximation introduced by the local model, the explanations provided by our method are effective in detecting the effects of attribute correlation. Our method is model-agnostic; hence, experts can compare explanations and the local behavior of predictions made for the same instance by different classifiers.
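As a rough illustration of the pipeline the abstract describes, the Python sketch below (1) samples a neighborhood around the instance, (2) fits an interpretable surrogate on the black box's labels for that neighborhood, (3) reads off the local rule covering the instance, and (4) scores each feature in the rule by prediction difference. This is not the authors' implementation: the helper `perturb_neighborhood`, the use of a shallow decision tree as the local rule-based model, and the sklearn-style `predict_proba`/`classes_` interface assumed for `black_box` are all illustrative assumptions.

```python
# Hypothetical sketch of a local rule-based explanation, assuming a
# sklearn-style black box classifier. Not the paper's actual method.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def perturb_neighborhood(x, X_train, n_samples=1000, rng=None):
    """Sample synthetic points around instance x by feature-wise
    resampling from the training data (one simple choice)."""
    if rng is None:
        rng = np.random.default_rng(0)
    Z = np.tile(x, (n_samples, 1)).astype(float)
    for j in range(Z.shape[1]):
        mask = rng.random(n_samples) < 0.5          # perturb ~half the points
        Z[mask, j] = rng.choice(X_train[:, j], size=int(mask.sum()))
    return Z

def local_rule_explanation(black_box, x, X_train):
    # 1. Build a neighborhood and label it with the black box.
    Z = perturb_neighborhood(x, X_train)
    y_Z = black_box.predict(Z)

    # 2. Fit an interpretable surrogate on the neighborhood; a shallow
    #    tree stands in here for the local rule-based model.
    surrogate = DecisionTreeClassifier(max_depth=3).fit(Z, y_Z)

    # 3. The rule covering x is the root-to-leaf path of the surrogate;
    #    internal nodes (feature index >= 0) give the rule's features.
    path = surrogate.decision_path(x.reshape(1, -1)).indices
    rule_features = {int(surrogate.tree_.feature[n])
                     for n in path if surrogate.tree_.feature[n] >= 0}

    # 4. Prediction difference: how much the predicted probability of
    #    x's class drops when a rule feature is marginalized over the
    #    training data.
    cls = black_box.predict(x.reshape(1, -1))[0]
    cls_idx = int(np.where(black_box.classes_ == cls)[0][0])
    p_orig = black_box.predict_proba(x.reshape(1, -1))[0, cls_idx]
    deltas = {}
    for j in rule_features:
        X_marg = np.tile(x, (len(X_train), 1)).astype(float)
        X_marg[:, j] = X_train[:, j]                # replace feature j
        p_marg = black_box.predict_proba(X_marg)[:, cls_idx].mean()
        deltas[j] = p_orig - p_marg                 # relevance of feature j
    return rule_features, deltas
```

Being model-agnostic, the sketch only queries `black_box` through `predict` and `predict_proba`, so the same routine could be run against different classifiers on the same instance and the resulting rules and prediction differences compared side by side.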
https://hdl.handle.net/11583/2749755