A new unsupervised predictive-model self-assessment approach that SCALEs / Ventura, Francesco; Proto, Stefano; Apiletti, Daniele; Cerquitelli, Tania; Panicucci, Simone; Baralis, Elena; Macii, Enrico; Macii, Alberto. - ELECTRONIC. - 2019 IEEE International Congress on Big Data, BigData Congress 2019, Milan, Italy, July 8-13, 2019: (2019), pp. 144-148. (Paper presented at the 2019 IEEE International Congress on Big Data, BigData Congress 2019, held in Milan, Italy, on July 8-13, 2019) [10.1109/BigDataCongress.2019.00033].

A new unsupervised predictive-model self-assessment approach that SCALEs

Ventura Francesco; Proto Stefano; Apiletti Daniele; Cerquitelli Tania; Baralis Elena; Macii Enrico; Macii Alberto
2019

Abstract

Evaluating how predictive models degrade over time has always been difficult, especially because new, unseen data may not fit the training distribution. This is a well-known problem in real-world use cases, where collecting a historical training set that covers all possible prediction labels may be very hard, too expensive, or entirely infeasible. To address this issue, we present a new unsupervised approach to detect and evaluate the degradation of classification and prediction models. It is based on a scalable variant of the Silhouette index, named Descriptor Silhouette, specifically designed to advance current state-of-the-art Big Data solutions. The proposed strategy has been tested and validated on both synthetic and real-world industrial use cases. To this end, it has been included in a framework named SCALE, and it proved to be efficient and more effective than current state-of-the-art solutions in assessing the degradation of prediction performance.
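
The general idea of using a cluster-cohesion index as an unsupervised degradation signal can be illustrated with a minimal sketch. It uses the standard Silhouette index, not the Descriptor Silhouette variant, whose definition is not given in this record. The nearest-centroid `predict` function, the synthetic Gaussian batches, and all names below are assumptions made purely for illustration; this is not the SCALE framework's implementation.

```python
import math
import random


def mean_silhouette(points, labels):
    """Mean standard Silhouette index over all points (O(n^2) distances)."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the Silhouette index needs at least two labels")

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            scores.append(0.0)  # common convention: singleton clusters score 0
            continue
        # a: mean distance to the other points carrying the same label
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the points of any other label
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


def make_batch(rng, centers, n_per_group, spread):
    """Synthetic 2D Gaussian groups (hypothetical data, for illustration only)."""
    points, groups = [], []
    for group, (cx, cy) in enumerate(centers):
        for _ in range(n_per_group):
            points.append((rng.gauss(cx, spread), rng.gauss(cy, spread)))
            groups.append(group)
    return points, groups


def predict(point, centers):
    """Hypothetical stand-in for a trained classifier: nearest training centroid."""
    return min(range(len(centers)), key=lambda k: math.dist(point, centers[k]))


rng = random.Random(0)
train_centers = [(0.0, 0.0), (6.0, 6.0)]

# Baseline: cohesion of the labeled training data.
train_points, train_labels = make_batch(rng, train_centers, 40, 0.8)
baseline = mean_silhouette(train_points, train_labels)

# New batch: a third, previously unseen group appears between the two classes.
# The model can only assign it to one of the known labels.
new_points, _ = make_batch(rng, train_centers + [(3.0, 3.0)], 40, 0.8)
new_labels = [predict(p, train_centers) for p in new_points]
drifted = mean_silhouette(new_points, new_labels)

# A drop in cohesion hints at degradation without needing true labels.
degradation = baseline - drifted
print(f"baseline={baseline:.3f} drifted={drifted:.3f} drop={degradation:.3f}")
```

In this sketch, no ground-truth labels are required for the new batch: only the model's own predictions are scored. The pairwise computation is quadratic in the number of points, which is the kind of cost a scalable variant would need to avoid.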
ISBN: 978-1-7281-2772-9

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2734504