Clustering real-world data is a challenging task, since many real-data collections are characterized by an inherent sparseness and variable distribution. An appealing domain that generates such data collections is the medical care scenario where collected data include a large cardinality of patient records and a variety of medical treatments usually adopted for a given disease pathology. This paper proposes a two-phase data mining methodology to iteratively analyze dierent dataset portions and locally identify groups of objects with common properties. Discovered cohesive clusters are then analyzed using sequential patterns to characterize temporal relationships among data features. To support an automatic classication of a new data objects within one of the discovered groups, a classication model is created starting from the computed cluster set. A mobile application has been also designed and developed to visualize and update data under analysis as well as categorizing new unlabeled records. A comparative study has been conducted on real datasets in the medical care scenario using diverse clustering algorithms. Results were compared in terms of cluster quality, execution time, classication performance and discovered sequential patterns. The experimental evaluation showed the eectiveness of MLC to discover interesting knowledge items and to easily exploit them through a mobile application. Results have been also discussed from a medical perspective.
Exploiting clustering algorithms in a multiple-level fashion: A comparative study in the medical care scenario / Cerquitelli, Tania; Chiusano, SILVIA ANNA; Xiao, Xin. - In: EXPERT SYSTEMS WITH APPLICATIONS. - ISSN 0957-4174. - STAMPA. - 55:(2016), pp. 297-312. [10.1016/j.eswa.2016.02.005]
Exploiting clustering algorithms in a multiple-level fashion: A comparative study in the medical care scenario
CERQUITELLI, TANIA;CHIUSANO, SILVIA ANNA;XIAO, XIN
2016
Abstract
Clustering real-world data is a challenging task, since many real-data collections are characterized by an inherent sparseness and variable distribution. An appealing domain that generates such data collections is the medical care scenario where collected data include a large cardinality of patient records and a variety of medical treatments usually adopted for a given disease pathology. This paper proposes a two-phase data mining methodology to iteratively analyze dierent dataset portions and locally identify groups of objects with common properties. Discovered cohesive clusters are then analyzed using sequential patterns to characterize temporal relationships among data features. To support an automatic classication of a new data objects within one of the discovered groups, a classication model is created starting from the computed cluster set. A mobile application has been also designed and developed to visualize and update data under analysis as well as categorizing new unlabeled records. A comparative study has been conducted on real datasets in the medical care scenario using diverse clustering algorithms. Results were compared in terms of cluster quality, execution time, classication performance and discovered sequential patterns. The experimental evaluation showed the eectiveness of MLC to discover interesting knowledge items and to easily exploit them through a mobile application. Results have been also discussed from a medical perspective.File | Dimensione | Formato | |
---|---|---|---|
MLC_paper_revised_Xin.pdf
Open Access dal 17/02/2018
Descrizione: Articolo principale
Tipologia:
2. Post-print / Author's Accepted Manuscript
Licenza:
Creative commons
Dimensione
1.12 MB
Formato
Adobe PDF
|
1.12 MB | Adobe PDF | Visualizza/Apri |
1-s2.0-S0957417416300288-main.pdf
accesso riservato
Tipologia:
2a Post-print versione editoriale / Version of Record
Licenza:
Non Pubblico - Accesso privato/ristretto
Dimensione
1.93 MB
Formato
Adobe PDF
|
1.93 MB | Adobe PDF | Visualizza/Apri Richiedi una copia |
ESWA_2016_Submitted.pdf
accesso aperto
Tipologia:
1. Preprint / submitted version [pre- review]
Licenza:
Pubblico - Tutti i diritti riservati
Dimensione
447.94 kB
Formato
Adobe PDF
|
447.94 kB | Adobe PDF | Visualizza/Apri |
Pubblicazioni consigliate
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.
https://hdl.handle.net/11583/2630095