Topic Analysis in News Via Sparse Learning: A Case Study on the 2016 US Presidential Elections / Calafiore, Giuseppe Carlo; El Ghaoui, Laurent; Preziosi, Alessandro; Russo, Luigi. - Electronic. - (2017). (Paper presented at the 20th World Congress of the International Federation of Automatic Control (IFAC 2017), held in Toulouse, France, July 9-14).

Topic Analysis in News Via Sparse Learning: A Case Study on the 2016 US Presidential Elections

Calafiore, Giuseppe Carlo; Preziosi, Alessandro
2017

Abstract

Textual data such as tweets and news articles are abundant on the web, but extracting useful information from such a deluge of data by hand is hardly feasible. In this paper, we discuss automated text analysis methods based on sparse optimization. In particular, we use sparse PCA and Elastic Net regression to extract intelligible topics from a large textual corpus and to obtain time-based signals that quantify the strength of each topic over time. These signals can then be used as regressors for modeling or predicting other related numerical indices. We apply this setup to the topics that arose during the 2016 US presidential elections, and we use the topic strength signals to model their influence on the election polls.
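
To make the idea of a sparse topic and its time-based strength signal concrete, the following is a minimal sketch and not the authors' implementation. It assumes a truncated power iteration as a simple stand-in for sparse PCA, an uncentered word co-occurrence (Gram) matrix, and a per-day average of document projections as the topic signal. The corpus, the dates, and all function names are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus of (day, text) pairs; NOT data from the paper.
DOCS = [
    ("2016-10-01", "economy jobs taxes economy"),
    ("2016-10-01", "economy jobs economy"),
    ("2016-10-02", "border wall"),
    ("2016-10-02", "economy taxes jobs"),
    ("2016-10-03", "border wall border"),
]

def term_matrix(docs):
    """Build a document-by-word count matrix over a sorted vocabulary."""
    vocab = sorted({w for _, text in docs for w in text.split()})
    rows = []
    for _, text in docs:
        counts = Counter(text.split())
        rows.append([float(counts[w]) for w in vocab])
    return vocab, rows

def gram(rows):
    """Uncentered word-by-word Gram matrix A^T A (an assumption of this sketch)."""
    n = len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]

def truncate(v, k):
    """Keep the k largest-magnitude entries, zero the rest, rescale to unit norm."""
    keep = sorted(range(len(v)), key=lambda i: abs(v[i]), reverse=True)[:k]
    out = [0.0] * len(v)
    for i in keep:
        out[i] = v[i]
    norm = math.sqrt(sum(x * x for x in out))
    return [x / norm for x in out]

def sparse_component(S, k, iters=50):
    """Truncated power iteration: a heuristic for a k-sparse leading component."""
    n = len(S)
    x = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        y = [sum(S[i][j] * x[j] for j in range(n)) for i in range(n)]
        x = truncate(y, k)
    return x

def daily_signal(docs, rows, comp):
    """Average projection of each day's documents onto the sparse component."""
    by_day = defaultdict(list)
    for (day, _), row in zip(docs, rows):
        by_day[day].append(sum(a * b for a, b in zip(row, comp)))
    return {day: sum(v) / len(v) for day, v in sorted(by_day.items())}

vocab, rows = term_matrix(DOCS)
component = sparse_component(gram(rows), k=3)
# The topic is read off as the words with nonzero loadings.
topic_words = [w for w, c in zip(vocab, component) if c != 0.0]
signal = daily_signal(DOCS, rows, component)
```

Because only k words carry nonzero loadings, the component can be read directly as a short word list, which is what makes the topic intelligible. The resulting per-day values in `signal` are the kind of series that could then serve as regressors, for example in an Elastic Net model of a numerical index such as poll results.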

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2670956