We present a multi-agent classification solution for identifying misogynous and aggressive content in Italian tweets. A first agent uses modern sentence embedding techniques to encode tweets and an SVM classifier to produce initial labels. A second agent, based on TF-IDF and Italian misogyny lexicons, is used jointly to improve the first agent's output on uncertain predictions. We evaluate our approach in the Automatic Misogyny Identification Shared Task of the EVALITA 2020 campaign. Results show that TF-IDF and lexicons effectively improve the supervised agent trained on sentence embeddings.
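The two-agent idea in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the `margin` threshold that defines an "uncertain" prediction, the placeholder lexicon terms, and the simplification of the second agent to a plain lexicon lookup (the TF-IDF component is omitted). The first agent is represented only by its output score, standing in for an SVM trained on sentence embeddings.

```python
# Hypothetical placeholder lexicon; the actual Italian misogyny lexicons
# are not listed in this record.
TOY_LEXICON = {"termine_offensivo_1", "termine_offensivo_2"}


def lexicon_agent(tweet: str, lexicon: set = TOY_LEXICON) -> int:
    """Stand-in for the second agent: return 1 if any lexicon term occurs."""
    tokens = tweet.lower().split()
    return int(any(token in lexicon for token in tokens))


def combine(tweet: str, first_agent_score: float, margin: float = 0.2) -> int:
    """Combine the two agents (assumed rule, not the authors' exact method).

    first_agent_score is the first agent's estimated probability that the
    tweet is misogynous. Scores closer than `margin` to 0.5 are treated as
    uncertain and handed to the second agent; otherwise the first agent's
    own decision is kept.
    """
    if abs(first_agent_score - 0.5) < margin:
        return lexicon_agent(tweet)
    return int(first_agent_score >= 0.5)


if __name__ == "__main__":
    # Uncertain score: the lexicon lookup decides.
    print(combine("ciao termine_offensivo_1", first_agent_score=0.45))
    # Confident score: the first agent's label is kept.
    print(combine("ciao a tutti", first_agent_score=0.9))
```

Deferring only near the decision boundary keeps the supervised agent in charge where it is confident and limits the influence of the lexicon to borderline cases, which is one plausible reading of "improve the first agent on uncertain predictions".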
PoliTeam@AMI: Improving Sentence Embedding Similarity with Misogyny Lexicons for Automatic Misogyny Identification in Italian Tweets / Attanasio, Giuseppe; Pastor, Eliana. - Electronic. - 2765:(2020). (Paper presented at the conference Evaluation Campaign of Natural Language Processing and Speech tools for Italian, held online on 17 December 2020).
PoliTeam@AMI: Improving Sentence Embedding Similarity with Misogyny Lexicons for Automatic Misogyny Identification in Italian Tweets
Attanasio, Giuseppe;Pastor, Eliana
2020
File | Access | Description | Type | License | Size | Format
---|---|---|---|---|---|---
PoliTeam_AMI2020_Attanasio_Pastor_Preprint.pdf | Open access | Preprint | 1. Preprint / submitted version [pre-review] | Public - All rights reserved | 124.55 kB | Adobe PDF
EVALITA2020_paper142.pdf | Open access | Main article | 2a Post-print, publisher's version / Version of Record | Creative Commons | 253.66 kB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/2854132