A comparison of different Large Language Models for the generation of UML class diagrams / Garaccione, Giacomo; Calabrese, Diego Maria; Coppola, Riccardo; Ardito, Luca. - ELECTRONIC. - (2025). (ACM/IEEE 28th International Conference on Model Driven Engineering Languages and Systems (MODELS), Grand Rapids, Michigan (USA), 05-10 October 2025) [10.1109/MODELS-C68889.2025.00078].
A comparison of different Large Language Models for the generation of UML class diagrams
Garaccione, Giacomo; Calabrese, Diego Maria; Coppola, Riccardo; Ardito, Luca
2025
Abstract
Large Language Models (LLMs) have emerged in recent years as an effective technology for various software-related applications, such as requirements definition, code generation and analysis, and software testing, thanks to their strong text analysis capabilities. Among these activities, the automated generation of Unified Modeling Language (UML) class diagrams has been explored with varying levels of success. This paper builds on previous work comparing human-made and LLM-generated diagrams by evaluating the performance of four state-of-the-art LLMs, including both proprietary and open-source models, in generating class diagrams as solutions to university exercises defined via natural language specifications. Using role-based few-shot prompting strategies, we generated class diagrams and evaluated them according to standard evaluation frameworks focusing on adherence to syntactic rules, semantic accuracy, and completeness with respect to the exercises' domain. The results show that all models displayed a reasonable capability to generate sufficiently complete diagrams, although with different strengths and weaknesses: the proprietary models (namely, ChatGPT and Gemini) excelled in completeness (avg. 64.2% for ChatGPT) and syntax quality (0 errors for both on all the exercises); DeepSeek proved to be the best at following semantic constraints (avg. 3 errors); and Qwen, while achieving completeness scores similar to the other models, struggled with following syntactic rules. These findings highlight the potential of LLMs as educational modeling assistants with varying degrees of competence, suggesting future benefits from their integration into educational tool-based modeling environments.

| File | Size | Format | |
|---|---|---|---|
| A comparison of different Large Language Models for the generation of UML class diagrams.pdf (open access; Type: 2. Post-print / Author's Accepted Manuscript; License: Public - All rights reserved) | 719.48 kB | Adobe PDF | View/Open |
| A_comparison_of_different_Large_Language_Models_for_the_generation_of_UML_class_diagrams.pdf (restricted access; Type: 2a Post-print publisher's version / Version of Record; License: Non-public - Private/restricted access) | 310.24 kB | Adobe PDF | View/Open, Request a copy |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/3003614
