Role of Data Properties on Sentiment Analysis of Texts via Convolutions / Çano, Erion; Morisio, Maurizio. - ELECTRONIC. - 745:(2018), pp. 330-337. (Paper presented at the conference WorldCist'18 – 6th World Conference on Information Systems and Technologies, held in Napoli, Italy, March 27-29, 2018) [10.1007/978-3-319-77703-0_34].

Role of Data Properties on Sentiment Analysis of Texts via Convolutions

Çano Erion; Morisio Maurizio
2018

Abstract

Dense and low-dimensional word embeddings opened up the possibility of analyzing text polarity with highly successful deep learning techniques such as Convolutional Neural Networks. In this paper we use pretrained word vectors in combination with simple neural networks of stacked convolution and max-pooling layers to explore the role of dataset size and document length in sentiment polarity prediction. We experiment with song lyrics and reviews of products or movies and see that the convolution-pooling combination is very fast and yet quite effective. We also find interesting relations between classification accuracy and dataset size, text length, and length of feature maps. Our next goal is the design of a generic neural architecture for analyzing the polarity of various text types, with high accuracy and few hyper-parameter changes.
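
As a rough illustration of the convolution-pooling combination mentioned in the abstract, the following minimal sketch applies one convolution filter and one max-pooling step to a toy sequence of word vectors, and shows how text length determines feature map length. Everything here is an assumption made for illustration: the function names, the kernel size of 3, the pool size of 2, the 4-dimensional random vectors standing in for pretrained embeddings, and the single-filter setup are not taken from the paper.

```python
import random


def conv1d(seq, kernel):
    """Valid 1D convolution of one filter over a sequence of word vectors.

    seq:    list of L word vectors, each a list of d floats.
    kernel: list of k weight vectors, each a list of d floats.
    Returns a feature map of length L - k + 1, with ReLU applied.
    """
    k = len(kernel)
    fmap = []
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        total = sum(w * x
                    for w_vec, x_vec in zip(kernel, window)
                    for w, x in zip(w_vec, x_vec))
        fmap.append(max(0.0, total))  # ReLU activation
    return fmap


def max_pool1d(fmap, size):
    """Non-overlapping max-pooling; a trailing partial window is dropped."""
    return [max(fmap[i:i + size])
            for i in range(0, len(fmap) - size + 1, size)]


def stacked_length(text_len, layers):
    """Feature map length after a stack of (kernel_size, pool_size) blocks."""
    length = text_len
    for kernel_size, pool_size in layers:
        length = (length - kernel_size + 1) // pool_size
    return length


# Toy input: 10 tokens with 4-dimensional vectors (hypothetical values that
# stand in for pretrained embeddings).
random.seed(0)
tokens = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(10)]
filt = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(3)]

feature_map = conv1d(tokens, filt)     # length 10 - 3 + 1 = 8
pooled = max_pool1d(feature_map, 2)    # length 8 // 2 = 4
print(len(feature_map), len(pooled))
print(stacked_length(100, [(3, 2), (3, 2)]))
```

Under these assumptions, each convolution-pooling block shortens the feature map, so longer documents leave longer feature maps after the same stack of layers.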
Files in this item:
ErionCanoWorldCist18.pdf

Open Access since 28/03/2019

Type: 2. Post-print / Author's Accepted Manuscript
License: PUBLIC - All rights reserved
Size: 241.46 kB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2696305