Most machine learning models are trained on historical data to learn a static mapping between their input and output variables. However, they are deployed on continuously streamed data, whose nature is likely to change over time (data or concept drift). As a consequence, model performance may suddenly and substantially degrade, forcing practitioners to continuously update the models to reflect the new data distribution. Few methods, however, are available to reliably detect data drift on heterogeneous data types (structured and unstructured), possibly without requiring labeled data at inference time. In this paper, we review existing methods for dataset drift detection, discuss their applicability to deep neural networks, and experiment on a practical case study related to semi-structured document analysis.

Detecting drift in deep learning: A methodology primer / Piano, Luca; Garcea, Fabio; Gatteschi, Valentina; Lamberti, Fabrizio; Morra, Lia. - In: IT PROFESSIONAL. - ISSN 1520-9202. - 24:5(2022), pp. 53-60. [10.1109/MITP.2022.3191318]

Detecting drift in deep learning: A methodology primer

Piano, Luca;Garcea, Fabio;Gatteschi, Valentina;Lamberti, Fabrizio;Morra, Lia
2022

Files in this record:

ITPro_Drift_Detection_Last.pdf (not available)
Type: 2. Post-print / Author's Accepted Manuscript
License: Not public - private/restricted access
Size: 962.52 kB
Format: Adobe PDF

Piano_Detecting_Drift_in_Deep_Learning_A_Methodology_Primer.pdf (not available)
Description: final published version
Type: 2a. Post-print, publisher's version / Version of Record
License: Not public - private/restricted access
Size: 1.57 MB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2970203