Quality of Word Embeddings on Sentiment Analysis Tasks

Çano, Erion; Morisio, Maurizio

doi:10.1007/978-3-319-59569-6_42

Word embeddings, or distributed representations of words, are used in applications such as machine translation, sentiment analysis, and topic identification. The quality of word embeddings and the performance of the applications built on them depend on several factors, including the training method and the size and relevance of the training corpus. In this study we compare the performance of a dozen pretrained word embedding models on lyrics sentiment analysis and movie review polarity tasks. According to our results, Twitter Tweets performs best on lyrics sentiment analysis, whereas Google News and Common Crawl are the top performers on movie polarity analysis. GloVe-trained models slightly outperform those trained with Skip-gram. Topic relevance and corpus size also significantly affect the quality of the models. When medium or large text sets are available, obtaining word embeddings from the same training dataset is usually the best choice.

Quality of Word Embeddings on Sentiment Analysis Tasks / Çano, Erion; Morisio, Maurizio. - ELECTRONIC. - 10260:(2017), pp. 332-338. (Paper presented at NLDB 2017, 22nd International Conference on Natural Language & Information Systems, held in Liege, Belgium, 21-23 June 2017) [10.1007/978-3-319-59569-6_42].


Year of publication: 2017
ISBN: 978-3-319-59569-6
Appears in types: 4.1 Contribution in conference proceedings

Files in this item:

File	Size	Format
ErionCanoWordEmbNLDB2017.pdf (open access; Type: 2. Post-print / Author's Accepted Manuscript; License: Public - All rights reserved)	461.39 kB	Adobe PDF


Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2668229

Warning: the data displayed have not been validated by the university.

PORTO @ Archivio Istituzionale della Ricerca
