SaFe-NeC: A scalable and flexible system for network data characterization / Apiletti, Daniele; Baralis, Elena Maria; Cerquitelli, Tania; Garza, Paolo; Venturini, Luca. - Electronic. - (2016), pp. 812-816. (Paper presented at the 2016 IEEE/IFIP Network Operations and Management Symposium, NOMS 2016, held in Istanbul (Turkey) in 2016) [10.1109/NOMS.2016.7502905].
SaFe-NeC: A scalable and flexible system for network data characterization
Apiletti, Daniele; Baralis, Elena Maria; Cerquitelli, Tania; Garza, Paolo; Venturini, Luca
2016
Abstract
Computer and telecommunication networks continuously generate large volumes of data and measurements, and the sheer volume makes it difficult to extract meaningful knowledge from them. This paper presents SaFe-NeC, a methodology for analyzing network traffic with data mining techniques, namely clustering and classification algorithms, focusing on the self-learning capabilities of state-of-the-art scalable approaches. Self-learning algorithms, coupled with self-assessment indicators and domain-driven semantics that enrich the data mining results, build a model of the data with minimal user intervention and highlight potentially meaningful interpretations for domain experts. In addition, a self-evolving model evaluation phase continuously tracks the quality degradation of the model, and rebuilding is triggered as soon as the quality indicators fall below a tolerance threshold. The methodology can exploit the computational advantages of distributed computing frameworks; the current implementation runs on Apache Spark. Preliminary experimental results on a real traffic dataset show the potential of the proposed methodology for characterizing network traffic data.

File | Size | Format |
---|---|---|
safenec.pdf (open access; description: postprint draft; type: 2. Post-print / Author's Accepted Manuscript; license: Public - All rights reserved) | 110.12 kB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/2650989
Warning: the data displayed have not been validated by the university.