Frequent closed itemset mining is among the most complex exploratory techniques in data mining, and provides the ability to discover hidden correlations in transactional datasets. The explosion of Big Data is leading to new parallel and distributed approaches. Unfortunately, most of them are designed to cope with low-dimensional datasets, whereas no distributed high-dimensional frequent closed itemset mining algorithms exists. This work introduces PaMPa-HD, a parallel MapReduce-based frequent closed itemset mining algorithm for high-dimensional datasets, based on Carpenter. The experimental results, performed on both real and synthetic datasets, show the efficiency and scalability of PaMPa-HD.
PaMPa-HD: A Parallel MapReduce-Based Frequent Pattern Miner for High-Dimensional Data / Apiletti, Daniele; Baralis, ELENA MARIA; Cerquitelli, Tania; Garza, Paolo; Michiardi, Pietro; Pulvirenti, Fabio. - (2015), pp. 839-846. (Intervento presentato al convegno The 3rd International Workshop on High Dimensional Data Mining (HDM’15) tenutosi a Atlantic City, NJ, USA nel 14 November 2015) [10.1109/ICDMW.2015.18].
PaMPa-HD: A Parallel MapReduce-Based Frequent Pattern Miner for High-Dimensional Data
APILETTI, DANIELE;BARALIS, ELENA MARIA;CERQUITELLI, TANIA;GARZA, PAOLO;MICHIARDI, PIETRO;PULVIRENTI, FABIO
2015
Abstract
Frequent closed itemset mining is among the most complex exploratory techniques in data mining, and provides the ability to discover hidden correlations in transactional datasets. The explosion of Big Data is leading to new parallel and distributed approaches. Unfortunately, most of them are designed to cope with low-dimensional datasets, whereas no distributed high-dimensional frequent closed itemset mining algorithms exists. This work introduces PaMPa-HD, a parallel MapReduce-based frequent closed itemset mining algorithm for high-dimensional datasets, based on Carpenter. The experimental results, performed on both real and synthetic datasets, show the efficiency and scalability of PaMPa-HD.File | Dimensione | Formato | |
---|---|---|---|
Pampa_Draft.pdf
accesso aperto
Descrizione: Draft
Tipologia:
2. Post-print / Author's Accepted Manuscript
Licenza:
Pubblico - Tutti i diritti riservati
Dimensione
520.75 kB
Formato
Adobe PDF
|
520.75 kB | Adobe PDF | Visualizza/Apri |
Pubblicazioni consigliate
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.
https://hdl.handle.net/11583/2639879
Attenzione
Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo