
A Density-based Preprocessing Technique to Scale Out Clustering / Baralis, Elena; Garza, Paolo; Pastor, Eliana. - ELECTRONIC. - (2018), pp. 2078-2087. (Paper presented at the 2018 IEEE International Conference on Big Data (Big Data), held in Seattle, WA, USA, December 10-13, 2018) [10.1109/BigData.2018.8621870].

A Density-based Preprocessing Technique to Scale Out Clustering

Baralis, Elena; Garza, Paolo; Pastor, Eliana
2018

Abstract

ScOut is a novel density-based preprocessing technique to scale out clustering. It reduces the original data set by condensing sets of points into macro-points while preserving the original spatial and density information. The compact set of macro-points can then be clustered efficiently with almost any clustering technique, including highly accurate algorithms. The experimental results show that ScOut scales out traditional clustering algorithms efficiently and effectively while preserving the quality of the generated clusters. Finally, ScOut has a single configuration parameter, step, which is straightforward to set because the results of the clustering algorithms are stable over a wide range of step values. As future work, we plan to investigate how to adjust the step parameter automatically when the dimensionality changes.
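The abstract does not say how points are condensed, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a uniform grid whose cell width is step, and represents each non-empty cell by one macro-point: the centroid of its points (spatial information) paired with the number of points it replaces (density information). The function name condense and the sample data are hypothetical.

```python
from collections import defaultdict
from math import floor


def condense(points, step):
    """Condense points into macro-points on a uniform grid of width `step`.

    Assumption: this grid-based scheme is an illustration only; the
    source does not specify how ScOut builds its macro-points.
    Returns a list of (centroid, weight) pairs, one per non-empty cell.
    """
    if step <= 0:
        raise ValueError("step must be positive")

    # Assign every point to the grid cell that contains it.
    cells = defaultdict(list)
    for p in points:
        key = tuple(floor(x / step) for x in p)
        cells[key].append(p)

    # One macro-point per cell: centroid plus the count of condensed points.
    macro_points = []
    for key in sorted(cells):
        members = cells[key]
        n = len(members)
        centroid = tuple(sum(coord) / n for coord in zip(*members))
        macro_points.append((centroid, n))
    return macro_points


# Hypothetical sample data: two small dense groups in 2D.
points = [(0.1, 0.2), (0.3, 0.4), (0.2, 0.1), (5.1, 5.2), (5.3, 5.4)]
macro_points = condense(points, step=1.0)
for centroid, weight in macro_points:
    print(centroid, weight)
```

Under these assumptions, the weights could be passed to a clustering algorithm that accepts per-sample weights, so that each macro-point counts as many times as the points it replaces. A larger step produces fewer, coarser macro-points.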
ISBN: 978-1-5386-5035-6

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2723877