Twitter data laid almost bare: An insightful exploratory analyser / Xiao, Xin; Attanasio, Antonio; Chiusano, SILVIA ANNA; Cerquitelli, Tania. - In: EXPERT SYSTEMS WITH APPLICATIONS. - ISSN 0957-4174. - STAMPA. - 90:(2017), pp. 501-517. [10.1016/j.eswa.2017.08.017]

Twitter data laid almost bare: An insightful exploratory analyser

XIAO, XIN;ATTANASIO, ANTONIO;CHIUSANO, SILVIA ANNA;CERQUITELLI, TANIA
2017

Abstract

In today’s world, social networks and online communities continuously generate large volumes of data that reflect users’ habits, personal interests, opinions and emotions. However, little value can be gained from such huge raw data collections unless they are translated into useful knowledge. Microblogs like Twitter have recently attracted a large body of research aimed at mining useful insights about users’ interests and preferences in different geographical areas and time periods. Indeed, the rather heterogeneous dimensions characterizing Twitter data, such as space, time and text content, call for innovative methods in the data mining discovery process. This paper presents TCharM, a data analytics methodology based on cluster analysis and association rule discovery to gain interesting knowledge from large collections of Twitter data. TCharM explores tweet collections along the three dimensions characterizing tweets (i.e., text content, posting time and place) to support context-aware topic trend analysis. To discover groups of tweets with good cohesion on the three tweet features, TCharM exploits a novel distance measure (TASTE) that drives the clustering task by considering the three tweet features in a single step. Association rule analysis is then used to concisely describe the cluster content with a set of understandable and significant patterns that reveal underlying correlations among frequent topics, tweeting times and places. TCharM can provide useful information to understand the evolution of people’s involvement in different topics, across geographical areas and over time. TCharM finds applications in various domains by providing valuable decision-making support to domain experts. The experimental evaluation performed on real datasets demonstrates the effectiveness of the proposed approach in discovering cohesive clusters and actionable knowledge from Twitter data.
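
The abstract does not define TASTE, so the following is only a minimal sketch of the general idea of a single distance that accounts for text content, posting time and place at once. Everything in it is an assumption made for illustration: the `Tweet` fields, the three component distances (Jaccard on tokens, circular hour-of-day difference, normalized haversine distance) and the equal weights are hypothetical and are not taken from the paper.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Tweet:
    """Hypothetical tweet record; field names are illustrative only."""
    tokens: frozenset  # set of words in the tweet text
    hour: float        # posting time as hour of day, 0 <= hour < 24
    lat: float         # latitude in degrees
    lon: float         # longitude in degrees


def text_distance(a: Tweet, b: Tweet) -> float:
    """Jaccard distance between the two token sets, in [0, 1]."""
    union = a.tokens | b.tokens
    if not union:
        return 0.0
    return 1.0 - len(a.tokens & b.tokens) / len(union)


def time_distance(a: Tweet, b: Tweet) -> float:
    """Circular hour-of-day difference scaled to [0, 1] (12 h maps to 1)."""
    diff = abs(a.hour - b.hour) % 24.0
    return min(diff, 24.0 - diff) / 12.0


def place_distance(a: Tweet, b: Tweet, max_km: float = 20015.0) -> float:
    """Haversine great-circle distance scaled to [0, 1] by max_km."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(a.lat), math.radians(b.lat)
    dphi = phi2 - phi1
    dlmb = math.radians(b.lon - a.lon)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    km = 2 * earth_radius_km * math.asin(min(1.0, math.sqrt(h)))
    return min(km / max_km, 1.0)


def combined_distance(a: Tweet, b: Tweet, weights=(1.0, 1.0, 1.0)) -> float:
    """Weighted average of the three component distances (assumed form, not TASTE)."""
    w_text, w_time, w_place = weights
    total = w_text + w_time + w_place
    return (
        w_text * text_distance(a, b)
        + w_time * time_distance(a, b)
        + w_place * place_distance(a, b)
    ) / total


# Two made-up tweets, used only to exercise the functions.
t1 = Tweet(frozenset({"traffic", "strike"}), 23.0, 45.07, 7.69)
t2 = Tweet(frozenset({"strike", "metro"}), 1.0, 45.46, 9.19)
d = combined_distance(t1, t2)
```

A pairwise distance of this kind could be passed to any clustering algorithm that accepts a custom or precomputed distance, so that the three features jointly shape the clusters rather than being handled in separate passes.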

Files in this record:

expert_systems_2017.pdf (not available)
Description: Publisher's version
Type: 2a Post-print publisher's version / Version of Record
License: Non-public - Private/restricted access
Size: 1.89 MB
Format: Adobe PDF

Paper_ESWA-2017.pdf (open access)
Description: Submitted version
Type: 1. Preprint / submitted version [pre-review]
License: Public - All rights reserved
Size: 9.87 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2679661