Evaluating Large Language Models in Exercises of UML Class Diagram Modeling / De Bari, Daniele; Garaccione, Giacomo; Coppola, Riccardo; Torchiano, Marco; Ardito, Luca. - ELECTRONIC. - (2024), pp. 393-399. (Paper presented at ESEM '24: 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, held in Barcelona (ES), October 24-25, 2024) [10.1145/3674805.3690741].
Evaluating Large Language Models in Exercises of UML Class Diagram Modeling
Garaccione, Giacomo;Coppola, Riccardo;Torchiano, Marco;Ardito, Luca
2024
Abstract
Large Language Models (LLMs) have rapidly established themselves in recent years as a means to support or substitute human actors in a variety of tasks. LLM agents can generate valid software models thanks to their inherent ability to evaluate textual requirements provided to them in the form of prompts. The goal of this work is to evaluate the capability of LLM agents to correctly generate UML class diagrams in Requirements Modeling activities in the field of Software Engineering. Our aim is to evaluate LLMs in an educational setting, i.e., to understand how valuable the results of LLMs are when compared to results produced by human actors, and how valuable LLMs can be for generating sample solutions to provide to students. For that purpose, we collected 20 exercises from a diverse set of web sources and compared the models generated by a human and an LLM solver in terms of syntactic, semantic, and pragmatic correctness, as well as distance from a provided reference solution. Our results show that the solutions generated by an LLM solver typically present a significantly higher number of errors in terms of semantic quality and textual difference from the provided reference solution, while no significant difference is found in syntactic and pragmatic quality. We can therefore conclude that, with a limited number of errors mostly related to the textual content of the solution, UML diagrams generated by LLM agents have the same level of understandability as those generated by humans, and exhibit the same frequency of violations of UML Class Diagram rules.
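The abstract mentions measuring the textual distance of each generated diagram from a reference solution. As a purely illustrative sketch (the record does not specify the metric used, so the distance function, class names, and PlantUML snippets below are hypothetical), one simple way to quantify such a difference is a normalized edit-based distance over the PlantUML source of the two models:

```python
# Illustrative sketch only: quantify the textual difference between a
# reference class diagram and an LLM-generated one, both given as
# PlantUML source text, using a normalized similarity from difflib.
from difflib import SequenceMatcher

def normalized_textual_distance(reference: str, candidate: str) -> float:
    """Return a distance in [0, 1]: 0 = identical texts, 1 = no overlap."""
    similarity = SequenceMatcher(None, reference, candidate).ratio()
    return 1.0 - similarity

# Hypothetical reference solution for an exercise.
reference_model = """
@startuml
class Customer {
  +name: String
  +placeOrder(): Order
}
class Order
Customer "1" --> "*" Order
@enduml
"""

# Hypothetical LLM-generated solution for the same exercise.
llm_model = """
@startuml
class Customer {
  +fullName: String
}
class Order
Customer --> Order
@enduml
"""

print(f"distance = {normalized_textual_distance(reference_model, llm_model):.2f}")
```

A metric like this captures only surface-level textual divergence; the syntactic, semantic, and pragmatic quality dimensions discussed in the paper require separate, rule-based or manual assessment.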
| File | Type | License | Size | Format |
|---|---|---|---|---|
| 3674805.3690741.pdf (open access) | 2a Post-print / Version of Record | Creative Commons | 499.54 kB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/2993437