Personalization in Information Retrieval (IR) has long been studied by the research community. Nevertheless, high-quality, real-world datasets for large-scale experiments and model evaluation remain scarce. This paper helps fill this gap by introducing SE-PQA (StackExchange - Personalized Question Answering), a new curated dataset designed for the development and evaluation of personalized models in the domain of community Question Answering (cQA). SE-PQA encompasses over one million queries and two million answers, annotated with a rich set of features that capture the social interactions among users on a cQA platform. We provide reproducible baseline methods for the cQA task based on the resource, including deep learning and personalized approaches. Preliminary experiments show that SE-PQA is suitable for training effective cQA models; they also show that personalization remarkably improves the effectiveness of all the methods tested.
SE-PQA: StackExchange Personalized Community Question Answering / Kasela, Pranav; Braga, Marco; Pasi, Gabriella; Perego, Raffaele. - 3802:(2024), pp. 99-102. (Paper presented at the 14th Italian Information Retrieval Workshop, held in Udine (ITA), September 5-6, 2024).
SE-PQA: StackExchange Personalized Community Question Answering
Marco Braga;
2024
| File | Size | Format | |
|---|---|---|---|
| sepqa_iir.pdf (open access; Type: 2a Post-print publisher's version / Version of Record; License: Creative Commons) | 990.79 kB | Adobe PDF | |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/3002213