BAC: A bagged associative classifier for big data frameworks / Venturini, Luca; Garza, Paolo; Apiletti, Daniele. - PRINT. - 637:(2016), pp. 137-146. (Paper presented at the 3rd International Workshop on Big Data Applications and Principles, BigDap 2016, co-located with the 20th East-European Conference on Advances in Databases and Information Systems, ADBIS 2016, held in Prague, Czech Republic, on 28-8-2016) [10.1007/978-3-319-44066-8_15].

BAC: A bagged associative classifier for big data frameworks

VENTURINI, LUCA;GARZA, PAOLO;APILETTI, DANIELE
2016

Abstract

Big Data frameworks enable powerful distributed computations that go beyond what is achievable on a single machine. In this work, we present a novel distributed associative classifier, named BAC, based on ensemble techniques. Ensembles are a popular approach in which several models are built on different subsets of the original dataset and then vote to produce a single classification outcome. Preliminary experiments on Apache Spark show that the proposed ensemble classifier achieves a quality comparable to that of the single-machine version on popular real-world datasets, and overcomes its scalability limits on large synthetic datasets.
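
The ensemble idea summarized in the abstract (several models trained on different subsets, then a vote) can be illustrated with a minimal single-machine sketch in Python. This is not the BAC implementation and does not use Apache Spark: the base learner below is a hypothetical stand-in that keeps one "item implies class" rule per item, and the names `train_base`, `predict_base`, `train_bagged`, and `predict_bagged` are assumptions introduced only for this illustration.

```python
import random
from collections import Counter, defaultdict


def train_base(sample):
    """Hypothetical base learner: keep, for each item, its most frequent class."""
    counts = defaultdict(Counter)
    labels = Counter()
    for items, label in sample:
        labels[label] += 1
        for item in items:
            counts[item][label] += 1
    rules = {item: c.most_common(1)[0][0] for item, c in counts.items()}
    default = labels.most_common(1)[0][0]  # fallback when no rule matches
    return rules, default


def predict_base(model, items):
    """Let every matching rule vote; fall back to the default class."""
    rules, default = model
    votes = Counter(rules[i] for i in items if i in rules)
    return votes.most_common(1)[0][0] if votes else default


def train_bagged(dataset, n_models, seed=0):
    """Train one base model per bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    return [
        train_base(rng.choices(dataset, k=len(dataset)))
        for _ in range(n_models)
    ]


def predict_bagged(models, items):
    """Majority vote over the predictions of all models in the ensemble."""
    votes = Counter(predict_base(m, items) for m in models)
    return votes.most_common(1)[0][0]
```

Each record is assumed to be a pair of an item set and a class label, for example `({"a"}, "pos")`. In a distributed setting the bootstrap samples would presumably be processed in parallel on separate partitions, but that part is outside the scope of this sketch.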
9783319440651
Files in this record:
File: bac-bigdap-paper (17).pdf
Access: open access
Type: 2. Post-print / Author's Accepted Manuscript
License: PUBLIC - All rights reserved
Size: 488.52 kB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2651083