In today’s world, social networks and online communities continuously generate large volumes of data that reflect users’ habits, personal interests, opinions and emotions. However, little profit can be gained from such huge raw data collections unless we are able to translate them into useful knowledge. Microblogs like Twitter have recently attracted a large body of research aimed at mining useful insights about users’ interests and preferences in different geographical areas and time periods. Indeed, the rather heterogeneous dimensions characterizing Twitter data, such as space, time and text content, call for innovative methods in the data mining process. This paper presents TCharM, a data analytics methodology based on cluster analysis and association rule discovery to extract interesting knowledge from large collections of Twitter data. TCharM explores tweet collections along the three dimensions characterizing tweets (i.e., text content, posting time and place) to support context-aware topic trend analysis. To discover groups of tweets with good cohesion across the three tweet features, TCharM exploits a novel distance measure (TASTE) that drives the clustering task by considering all three tweet features in a single step. Association rule analysis is then exploited to concisely describe the cluster content with a set of understandable and significant patterns that reveal underlying correlations among frequent topics, tweeting times and places. TCharM can provide useful information to understand the evolution of people’s involvement in different topics, across geographical areas and over time. TCharM finds applications in various domains, providing valuable decision-making support to domain experts. The experimental evaluation performed on real datasets demonstrates the effectiveness of the proposed approach in discovering cohesive clusters and actionable knowledge from Twitter data.
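To illustrate the idea of a single distance covering text, time and place, here is a minimal sketch. It is not the TASTE measure, whose definition is not given in this abstract. The Jaccard, time-gap and haversine components, the equal weights, the normalization caps (`max_hours`, `max_km`) and the tweet fields (`terms`, `hour`, `latlon`) are all assumptions made for illustration.

```python
import math


def jaccard_distance(a, b):
    """Text dissimilarity: 1 minus the Jaccard overlap of two term collections."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    # Clamp to guard against floating-point values slightly above 1.
    return 2 * 6371.0 * math.asin(min(1.0, math.sqrt(h)))


def combined_distance(t1, t2, weights=(1 / 3, 1 / 3, 1 / 3),
                      max_hours=24.0, max_km=1000.0):
    """Hypothetical weighted sum of text, time and space distances.

    Each component is scaled to [0, 1]; the caps and weights are
    illustrative choices, not values taken from the paper.
    """
    w_text, w_time, w_space = weights
    d_text = jaccard_distance(t1["terms"], t2["terms"])
    d_time = min(abs(t1["hour"] - t2["hour"]) / max_hours, 1.0)
    d_space = min(haversine_km(t1["latlon"], t2["latlon"]) / max_km, 1.0)
    return w_text * d_text + w_time * d_time + w_space * d_space
```

A pairwise function of this kind could be passed to any clustering algorithm that accepts a custom or precomputed distance, so that the three features are compared in one step rather than clustered separately.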
|Title:||Twitter data laid almost bare: An insightful exploratory analyser|
|Publication date:||2017|
|Digital Object Identifier (DOI):||10.1016/j.eswa.2017.08.017|
|Appears in categories:||1.1 Journal article|