Mining and modeling web trajectories from passive traces / Vassio, Luca; Mellia, Marco; Figueiredo, Flavio; Couto da Silva, Ana Paula; Almeida, Jussara M. - ELECTRONIC. - (2017), pp. 4016-4021. (Paper presented at the 2017 IEEE International Conference on Big Data (BIGDATA), held in Boston (USA), 11-14 December 2017) [10.1109/BigData.2017.8258416].
Mining and modeling web trajectories from passive traces
Vassio, Luca; Mellia, Marco; Couto da Silva, Ana Paula
2017
Abstract
On the modern web, users contact many services, each identified by the domain name of its server. The temporal sequence of visited domains, and the transitions between them, form a user's trajectory on the web. In this work, we analyze 4 weeks of such trajectories, extracted from logs collected in our university network, and mine them with big data and machine learning methodologies to extract the interests of users. Our goal is to build a model of these trajectories and find similarities among them, so as to observe the peculiarities of users’ browsing. Building on the model, we propose a methodology to automatically group the trajectories of single users and/or communities into highly descriptive environments, which in turn allow the analyst to identify the topic of interest. We also propose an automatic way to highlight differences in the popularity and content of environments. Lastly, we analyze the transitions among environments, showing that people in smaller communities, e.g., in the same department, behave much more homogeneously than people at large, e.g., in the university.

File | Size | Format |
---|---|---|
DS4N.pdf (open access). Description: post-print. Type: 2. Post-print / Author's Accepted Manuscript. License: PUBLIC - All rights reserved | 940.23 kB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/2697991
Warning! The data shown has not been validated by the university.