Unsupervised Concept Drift Detection From Deep Learning Representations in Real-Time / Greco, Salvatore; Vacchetti, Bartolomeo; Apiletti, Daniele; Cerquitelli, Tania. - In: IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING. - ISSN 1558-2191. - 37:10(2025), pp. 6232-6245. [10.1109/TKDE.2025.3593123]
Unsupervised Concept Drift Detection From Deep Learning Representations in Real-Time
Salvatore Greco;Bartolomeo Vacchetti;Daniele Apiletti;Tania Cerquitelli
2025
Abstract
Concept drift is the phenomenon in which the underlying data distributions and statistical properties of a target domain change over time, leading to a degradation in model performance. Consequently, production models require continuous monitoring for drift. Most drift detection methods to date are supervised, relying on ground-truth labels. However, such methods are inapplicable in many real-world scenarios, as true labels are often unavailable. Although recent efforts have proposed unsupervised drift detectors, many lack the accuracy required for reliable detection or are too computationally intensive for real-time use in high-dimensional, large-scale production environments. Moreover, they often fail to characterize or explain drift effectively. To address these limitations, we propose DRIFTLENS, an unsupervised framework for real-time concept drift detection and characterization. Designed for deep learning classifiers handling unstructured data, DRIFTLENS leverages distribution distances in deep learning representations to enable efficient and accurate detection. Additionally, it characterizes drift by analyzing and explaining its impact on each label. Our evaluation across classifiers and data types demonstrates that DRIFTLENS (i) outperforms previous methods in detecting drift in 15/17 use cases; (ii) runs at least 5 times faster; (iii) produces drift curves that align closely with actual drift (correlation ≥ 0.85); and (iv) effectively identifies representative drift samples as explanations.

File | Access | Type | License | Size | Format |
---|---|---|---|---|---|
Unsupervised Concept Drift Detection from Deep Learning Representations in Real-time.pdf | Open access | 2. Post-print / Author's Accepted Manuscript | Creative Commons | 37.27 MB | Adobe PDF |
Unsupervised_Concept_Drift_Detection_From_Deep_Learning_Representations_in_Real-Time.pdf | Open access | 2a Post-print publisher's version / Version of Record | Creative Commons | 8.19 MB | Adobe PDF |
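
The abstract describes detecting drift through distribution distances computed on deep learning representations, with a per-label view for characterization. The sketch below illustrates that general idea only; it is not the DRIFTLENS implementation. The abstract does not name the distance or the procedure, so everything here is an assumption: the distance (a Fréchet distance between Gaussian fits, simplified to diagonal covariance), the synthetic embeddings standing in for real classifier representations, and all function names, which are hypothetical.

```python
import random
import statistics


def gaussian_stats(embeddings):
    """Per-dimension mean and standard deviation of a list of embedding vectors."""
    columns = list(zip(*embeddings))
    means = [statistics.fmean(c) for c in columns]
    stds = [statistics.pstdev(c) for c in columns]
    return means, stds


def diagonal_frechet_distance(reference, window):
    """Squared Frechet distance between two diagonal-covariance Gaussian fits.

    With diagonal covariances this reduces to
    ||mu_r - mu_w||^2 + sum_i (sd_r[i] - sd_w[i])^2.
    The diagonal simplification is an assumption made for this sketch.
    """
    mu_r, sd_r = gaussian_stats(reference)
    mu_w, sd_w = gaussian_stats(window)
    mean_term = sum((a - b) ** 2 for a, b in zip(mu_r, mu_w))
    spread_term = sum((a - b) ** 2 for a, b in zip(sd_r, sd_w))
    return mean_term + spread_term


def per_label_distances(ref_emb, ref_labels, win_emb, win_labels):
    """Distance computed separately for each predicted label seen in both sets."""
    distances = {}
    for label in sorted(set(ref_labels) & set(win_labels)):
        ref = [e for e, l in zip(ref_emb, ref_labels) if l == label]
        win = [e for e, l in zip(win_emb, win_labels) if l == label]
        distances[label] = diagonal_frechet_distance(ref, win)
    return distances


def make_embeddings(rng, n, dim, shift=0.0):
    """Synthetic stand-in for embeddings extracted from a classifier."""
    return [[rng.gauss(shift, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)
reference = make_embeddings(rng, 500, 8)                  # baseline data
same_window = make_embeddings(rng, 500, 8)                # same distribution
drifted_window = make_embeddings(rng, 500, 8, shift=1.5)  # shifted distribution

no_drift_score = diagonal_frechet_distance(reference, same_window)
drift_score = diagonal_frechet_distance(reference, drifted_window)
print(f"no drift: {no_drift_score:.3f}  drift: {drift_score:.3f}")
```

No labels are needed at any point: the score depends only on the embeddings and, for the per-label variant, on the labels the model itself predicts. A window drawn from the reference distribution scores close to zero, while the shifted window scores much higher. How such a score would be turned into a drift decision is not covered here.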
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/3003026