Micro-influencers are effective elements in the marketing strategies of companies and institutions because of their capability to create an hyper-engaged audience around a specific topic of interest. In recent years, many scientific approaches and commercial tools have handled the task of detecting this type of social media users. These strategies adopt solutions ranging from rule based machine learning models to deep neural networks and graph analysis on text, images and account information. This work compares the existing solutions and proposes an ensemble method to generalize them with different input data and social media platforms. The deployed solution combines deep learning models on unstructured data with statistical machine learning models on structured data. We retrieve both social media accounts information and multimedia posts on Twitter and Instagram. These data are mapped into feature vectors for an eXtreme Gradient Boosting (XGBoost) classifier. Sixty different topics have been analyzed to build a rule based gold standard dataset and to compare the performance of our approach against baseline classifiers. We prove the effectiveness of our work by comparing the accuracy, precision, recall, and f1 score of our model with different configurations and architectures. We obtained an accuracy of 0.98 with our best performing model.

MIMIC: a Multi Input Micro-Influencers Classifier / Leonardi, Simone; Ardito, Luca. - ELETTRONICO. - 1:(2022), pp. 168-175. ((Intervento presentato al convegno ICSMADM 2022: 16. International Conference on Social Media Analysis and Data Mining tenutosi a online nel May 26-27, 2022.

MIMIC: a Multi Input Micro-Influencers Classifier

Leonardi,Simone;Ardito,Luca
2022

Abstract

Micro-influencers are effective elements in the marketing strategies of companies and institutions because of their capability to create an hyper-engaged audience around a specific topic of interest. In recent years, many scientific approaches and commercial tools have handled the task of detecting this type of social media users. These strategies adopt solutions ranging from rule based machine learning models to deep neural networks and graph analysis on text, images and account information. This work compares the existing solutions and proposes an ensemble method to generalize them with different input data and social media platforms. The deployed solution combines deep learning models on unstructured data with statistical machine learning models on structured data. We retrieve both social media accounts information and multimedia posts on Twitter and Instagram. These data are mapped into feature vectors for an eXtreme Gradient Boosting (XGBoost) classifier. Sixty different topics have been analyzed to build a rule based gold standard dataset and to compare the performance of our approach against baseline classifiers. We prove the effectiveness of our work by comparing the accuracy, precision, recall, and f1 score of our model with different configurations and architectures. We obtained an accuracy of 0.98 with our best performing model.
File in questo prodotto:
File Dimensione Formato  
22ES050097.pdf

accesso aperto

Descrizione: Articolo Principale
Tipologia: 2. Post-print / Author's Accepted Manuscript
Licenza: PUBBLICO - Tutti i diritti riservati
Dimensione 302.81 kB
Formato Adobe PDF
302.81 kB Adobe PDF Visualizza/Apri
Pubblicazioni consigliate

Caricamento pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: http://hdl.handle.net/11583/2964809