Detecting drift in deep learning: A methodology primer / Piano, Luca; Garcea, Fabio; Gatteschi, Valentina; Lamberti, Fabrizio; Morra, Lia. - In: IT PROFESSIONAL. - ISSN 1520-9202. - (In press).

Detecting drift in deep learning: A methodology primer

Piano, Luca; Garcea, Fabio; Gatteschi, Valentina; Lamberti, Fabrizio; Morra, Lia
In press

Abstract

Most machine learning models are trained on historical data to learn a static mapping between input and output variables. Once deployed, however, they process continuously streamed data whose distribution is likely to change over time (data drift or concept drift). As a consequence, model performance may degrade suddenly and substantially, forcing practitioners to continuously update the models to reflect the new data distribution. Few methods, however, can reliably detect data drift on heterogeneous data types (structured and unstructured), ideally without requiring labeled data at inference time. In this paper, we review existing methods for dataset drift detection, discuss their applicability to deep neural networks, and experiment on a practical case study related to semi-structured document analysis.
Type: 2. Post-print / Author's Accepted Manuscript
License: Not public - Private/restricted access
Use this identifier to cite or link to this document: http://hdl.handle.net/11583/2970203