
Garaccione, Giacomo; Vega Carrazan, Pablo Federico; Coppola, Riccardo; Ardito, Luca (2025). Evaluating Large Language Models in Exercises of UML Use Case Diagrams Modeling. In: 2025 IEEE/ACM International Workshop on Natural Language-Based Software Engineering (NLBSE), Ottawa (CA), 27-28 April 2025, pp. 41-44. DOI: 10.1109/nlbse66842.2025.00015.

Evaluating Large Language Models in Exercises of UML Use Case Diagrams Modeling

Garaccione, Giacomo; Vega Carrazan, Pablo Federico; Coppola, Riccardo; Ardito, Luca
2025

Abstract

In recent years, Large Language Models (LLMs) have been extensively used in several Software Engineering tasks, from requirements analysis to coding and software testing. Research has shown that LLMs can effectively generate software models to assist in software documentation. The goal of this study is to assess the capability of LLM agents to generate UML Use Case Diagrams (UCDs) starting from software requirements written in natural language. We perform the assessment in an educational setting, i.e., we evaluate their ability to solve software modeling exercises tailored for master's students in SE curricula. Our results, based on a comparison of the solutions produced by a human and an LLM solver on 17 UCD modeling exercises, show that the LLM-generated diagrams are comparable to human-proposed solutions in terms of completeness and redundancy, with no significant difference between the two.
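As a minimal illustration of the task the study evaluates, the sketch below prompts an LLM to turn a natural-language requirement into a use case diagram expressed as PlantUML text. The OpenAI client, model name, requirement text, and prompt wording are all illustrative assumptions; the paper's actual prompting pipeline is not described here.

```python
# Sketch: asking an LLM to draft a UML Use Case Diagram (as PlantUML source)
# from a natural-language requirement. Model choice and prompts are
# hypothetical, not the setup used in the paper.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Illustrative requirement, not taken from the paper's exercise set.
requirement = (
    "Customers of an online bookstore can search the catalog, add books to a "
    "cart, and check out; administrators can add or remove catalog entries."
)

response = client.chat.completions.create(
    model="gpt-4o",  # assumed model
    messages=[
        {"role": "system",
         "content": "You are a software modeling assistant. Reply only with a "
                    "PlantUML use case diagram (@startuml ... @enduml)."},
        {"role": "user",
         "content": f"Model these requirements as a UML Use Case Diagram:\n"
                    f"{requirement}"},
    ],
)

# Print the generated PlantUML source; a renderer turns it into the diagram.
print(response.choices[0].message.content)
```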
ISBN: 979-8-3315-3864-4
Files in this record:
Evaluating_Large_Language_Models_in_Exercises_of_UML_Use_Case_Diagrams_Modeling.pdf
Type: 2a Post-print editorial version / Version of Record
License: Non-public - private/restricted access
Size: 368.81 kB
Format: Adobe PDF
Access: restricted (copy available on request)

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this record: https://hdl.handle.net/11583/3001271