Network traffic analysis by means of Misleading Generalized Itemsets / Apiletti, Daniele; Baralis, Elena Maria; Cagliero, Luca; Cerquitelli, Tania; Chiusano, Silvia Anna; Garza, Paolo; Grimaudo, Luigi; Pulvirenti, Fabio. - (2014), pp. 39-52. (Paper presented at the First International Workshop on Big Data Applications and Principles (BIGDAP 2014), held in Madrid, Spain, on 11-12 September 2014).

Network traffic analysis by means of Misleading Generalized Itemsets

Apiletti, Daniele; Baralis, Elena Maria; Cagliero, Luca; Cerquitelli, Tania; Chiusano, Silvia Anna; Garza, Paolo; Grimaudo, Luigi; Pulvirenti, Fabio
2014

Abstract

Over the last ten years, the explosive growth of Internet usage has made network traffic analytics and data mining issues of primary importance. Generalized itemset mining is an established data mining technique that discovers multiple-level correlations among data equipped with analyst-provided taxonomies. In this work, we address the discovery of a specific type of generalized itemset, named misleading generalized itemsets (MGIs), which can be used to highlight anomalous situations in potentially large datasets. More specifically, MGIs are high-level patterns whose correlation type contrasts with that of many of their descendant patterns according to the input taxonomy. This work proposes a new framework, named MGI-Cloud, which efficiently extracts misleading generalized itemsets. The framework has a distributed architecture and is composed of a set of MapReduce jobs. As a reference case study, MGI-Cloud has been applied to real network datasets, captured in different stages from a backbone link of an Italian ISP. The experiments demonstrate the effectiveness of our approach in a real-life scenario.
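
To make the notion of a misleading generalized itemset more concrete, the following is a minimal single-machine sketch, not the MGI-Cloud implementation and not a MapReduce job. It rests on several assumptions that do not come from the abstract: the two-level taxonomy and the item names are hypothetical, the Kulczynski measure is used as the correlation measure, and the thresholds `max_neg` and `min_pos` are arbitrary illustrative values. It shows a high-level pair that looks positively correlated while every one of its descendant pairs is negatively correlated.

```python
from itertools import product

# Hypothetical two-level taxonomy: leaf item -> generalized (parent) item.
TAXONOMY = {
    "port=80": "service=web",
    "port=443": "service=web",
    "port=8080": "service=web",
    "src=10.0.1.x": "src=campus",
    "src=10.0.2.x": "src=campus",
    "src=10.0.3.x": "src=campus",
}

# Toy dataset (made up): one flow record per (port, source subnet) combination.
PORTS = ["port=80", "port=443", "port=8080"]
SOURCES = ["src=10.0.1.x", "src=10.0.2.x", "src=10.0.3.x"]
TRANSACTIONS = [frozenset(pair) for pair in product(PORTS, SOURCES)]


def children(item):
    """Leaf items that the taxonomy places directly under `item`."""
    return sorted(leaf for leaf, parent in TAXONOMY.items() if parent == item)


def matches(item, transaction):
    """A transaction supports an item if it holds the item or one of its children."""
    return item in transaction or any(TAXONOMY.get(leaf) == item for leaf in transaction)


def support(itemset, transactions):
    """Number of transactions that support every item of the itemset."""
    return sum(all(matches(i, t) for i in itemset) for t in transactions)


def kulc(a, b, transactions):
    """Kulczynski measure of the pair {a, b} (assumed correlation measure)."""
    both = support((a, b), transactions)
    sup_a = support((a,), transactions)
    sup_b = support((b,), transactions)
    if sup_a == 0 or sup_b == 0:
        return 0.0
    return 0.5 * (both / sup_a + both / sup_b)


def correlation_type(value, max_neg=0.4, min_pos=0.6):
    """Map a correlation value to a type; both thresholds are illustrative."""
    if value >= min_pos:
        return "positive"
    if value <= max_neg:
        return "negative"
    return "none"


def contrasting_fraction(a, b, transactions):
    """Type of the high-level pair and the share of descendant pairs of opposite type."""
    parent_type = correlation_type(kulc(a, b, transactions))
    opposite = {"positive": "negative", "negative": "positive"}.get(parent_type)
    pairs = list(product(children(a), children(b)))
    if not pairs:
        return parent_type, 0.0
    contrasting = sum(
        correlation_type(kulc(x, y, transactions)) == opposite for x, y in pairs
    )
    return parent_type, contrasting / len(pairs)


if __name__ == "__main__":
    print(contrasting_fraction("service=web", "src=campus", TRANSACTIONS))
```

In this toy dataset the generalized pair {service=web, src=campus} occurs in every record, so it is classified as positive, whereas each specific port and subnet pair co-occurs in only one of the three records containing either item and is classified as negative. A pattern of this kind is what the abstract calls misleading: the high-level view hides the opposite behaviour of its descendants.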

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2573939