A Density-based Preprocessing Technique to Scale Out Clustering / Baralis, Elena; Garza, Paolo; Pastor, Eliana. - ELECTRONIC. - (2018), pp. 2078-2087. (Paper presented at the 2018 IEEE International Conference on Big Data (Big Data), held in Seattle, WA, USA, December 10-13, 2018) [10.1109/BigData.2018.8621870].
A Density-based Preprocessing Technique to Scale Out Clustering
Baralis, Elena;Garza, Paolo;Pastor, Eliana
2018
Abstract
ScOut is a novel density-based preprocessing technique to scale out clustering. It reduces the original data set by condensing sets of points into macro-points, while preserving the original spatial and density information. The compact set of macro-points can then be efficiently clustered by means of almost any clustering technique, including highly accurate algorithms. The experimental results show the efficiency and effectiveness of ScOut in scaling out traditional clustering algorithms, while preserving the quality of the generated clusters. Finally, ScOut is characterized by a single configuration parameter, step, that can be straightforwardly set, because the results of the clustering algorithms are stable for a wide range of step values. As future work, we plan to investigate a solution to adjust the step parameter automatically when the dimensionality changes.
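The abstract does not say how macro-points are built, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a uniform grid whose cell side equals step, and it represents each macro-point as a centroid (spatial information) paired with a point count (a simple density proxy). The function name condense and the sample data are hypothetical.

```python
from collections import defaultdict
from math import floor


def condense(points, step):
    """Condense points into one macro-point per grid cell of side `step`.

    ASSUMPTION: the uniform grid is an illustrative stand-in; the source
    does not describe how ScOut actually forms its macro-points.
    Returns a list of (centroid, count) pairs ordered by cell index.
    """
    if step <= 0:
        raise ValueError("step must be positive")

    # Bucket every point by the integer index of the cell it falls into.
    cells = defaultdict(list)
    for p in points:
        key = tuple(floor(x / step) for x in p)
        cells[key].append(p)

    # One macro-point per non-empty cell: centroid plus member count.
    macro_points = []
    for key in sorted(cells):
        members = cells[key]
        n = len(members)
        centroid = tuple(sum(coord) / n for coord in zip(*members))
        macro_points.append((centroid, n))
    return macro_points


# Hypothetical sample data: two well-separated groups of 2-D points.
points = [(0.1, 0.2), (0.3, 0.4), (0.2, 0.1), (5.1, 5.2), (5.3, 5.4)]
macro = condense(points, step=1.0)
```

Under these assumptions, the list macro could be passed to a clustering algorithm that accepts weighted points, with the counts used as sample weights. A larger step yields fewer and coarser macro-points.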
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/2723877