Large Language Models-aided Literature Reviews: A Study on Few-Shot Relevance Classification / Giobergia, Flavio; Koudounas, Alkis; Baralis, Elena. - (2024). (Paper presented at the 2024 IEEE 18th International Conference on Application of Information and Communication Technologies (AICT), held in Turin, Italy, 25-27 September 2024) [10.1109/AICT61888.2024.10740404].
Large Language Models-aided Literature Reviews: A Study on Few-Shot Relevance Classification
Flavio Giobergia; Alkis Koudounas; Elena Baralis
2024
Abstract
Conducting a comprehensive literature review is a critical step in the research process, often requiring significant time and effort to identify and evaluate relevant academic papers. Traditional methods rely heavily on keyword-based searches to filter potentially relevant articles, which are then manually reviewed for inclusion. This process is not only labor-intensive but also susceptible to subjective biases, potentially affecting the consistency and accuracy of the review. This paper explores the potential of using large language models (LLMs) to automate the relevance assessment phase of literature reviews, thereby addressing some of the limitations of traditional methods. Specifically, we investigate the application of few-shot learning with various LLMs to determine the relevance of papers identified through keyword searches. We evaluate the sensitivity of this approach to the number of shots provided and compare the performance across different open-source LLMs, including Llama-3, Mistral, and Phi-3. Our findings aim to provide insights into the effectiveness of using LLMs for literature review processes, potentially transforming how researchers conduct literature reviews.
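The few-shot setup the abstract describes can be illustrated with a minimal sketch. The prompt wording, the example labels, and the model choice (Phi-3 mini, standing in for the Llama-3, Mistral, and Phi-3 models compared in the paper) are assumptions made for illustration, not the authors' exact pipeline.

```python
# Minimal sketch of few-shot relevance classification with an open-source LLM.
# All prompt text, labels, and abstracts below are hypothetical placeholders.
from transformers import pipeline

# Small instruction-tuned model used as a stand-in; any causal LM works here.
generator = pipeline("text-generation", model="microsoft/Phi-3-mini-4k-instruct")

# k labeled (abstract, label) pairs; the paper varies this number of shots.
FEW_SHOT_EXAMPLES = [
    ("We use large language models to screen papers for systematic reviews.",
     "relevant"),
    ("A study of soil erosion patterns in coastal regions.",
     "not relevant"),
]

def build_prompt(topic, examples, candidate_abstract):
    """Assemble a few-shot prompt: task instruction, k labeled shots, query."""
    parts = [f"Decide whether each paper is relevant to the topic: {topic}.\n"]
    for abstract, label in examples:
        parts.append(f"Abstract: {abstract}\nRelevant: {label}\n")
    parts.append(f"Abstract: {candidate_abstract}\nRelevant:")
    return "\n".join(parts)

prompt = build_prompt(
    "LLM-aided literature reviews",
    FEW_SHOT_EXAMPLES,
    "We evaluate prompting strategies for automated paper screening.",
)
# Greedy decoding, a few tokens: the completion is the predicted label.
out = generator(prompt, max_new_tokens=5, do_sample=False)
print(out[0]["generated_text"][len(prompt):].strip())  # e.g. "relevant"
```

Varying the length of FEW_SHOT_EXAMPLES and swapping the model name reproduces, in spirit, the shot-sensitivity and model-comparison experiments the abstract outlines.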
https://hdl.handle.net/11583/2996188