A comparison of different Large Language Models for the generation of UML class diagrams / Garaccione, Giacomo; Calabrese, Diego Maria; Coppola, Riccardo; Ardito, Luca. - ELECTRONIC. - (In press). (Paper presented at the ACM / IEEE 28th International Conference on Model Driven Engineering Languages and Systems (MODELS), held in Grand Rapids, Michigan (USA), 5-10 October 2025).
A comparison of different Large Language Models for the generation of UML class diagrams
Garaccione, Giacomo; Coppola, Riccardo; Ardito, Luca
In press
Abstract
Large Language Models (LLMs) have emerged in recent years as an effective technology for various software-related applications, such as requirements definition, code generation and analysis, and software testing, thanks to their strong text analysis capabilities. Among these activities, the automated generation of Unified Modeling Language (UML) class diagrams has been explored with varying levels of success. This paper builds on previous work comparing human-made and LLM-generated diagrams by evaluating the performance of four state-of-the-art LLMs, including both proprietary and open-source models, in generating class diagrams as solutions to university exercises defined via natural language specifications. Using role-based few-shot prompting strategies, we generated class diagrams and evaluated them according to standard evaluation frameworks focusing on adherence to syntactic rules, semantic accuracy, and completeness with respect to the exercises' domain. The results show that all models displayed a reasonable capability to generate sufficiently complete diagrams, although with different strengths and weaknesses: the proprietary models (namely, ChatGPT and Gemini) excelled in completeness (avg. 64.2% for ChatGPT) and syntax quality (0 errors for both on all the exercises); DeepSeek proved to be the best at following semantic constraints (avg. 3 errors); and Qwen, while achieving completeness scores similar to those of the other models, struggled to follow syntactic rules. These findings highlight the potential of LLMs, with their varying degrees of competence, as educational modeling assistants, and suggest future benefits from their integration into educational tool-based modeling environments.

File | Access | Type | License | Size | Format |
---|---|---|---|---|---|
A comparison of different Large Language Models for the generation of UML class diagrams.pdf | Open access | 2. Post-print / Author's Accepted Manuscript | Public - All rights reserved | 719.48 kB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/3003614