I Will Try to Fix You: Large Language Models for Mobile GUI Test Repair / Fulcini, T., Poletti, A., Arnaudo, A., Coppola, R. - (2026), pp. 341-348. (Workshop on Validation, Analysis and Evolution of Software Tests, Limassol (CY), 17-20 March 2026) [10.1109/SANER-C67878.2026.00052].
I Will Try to Fix You: Large Language Models for Mobile GUI Test Repair
Tommaso Fulcini; Alessandro Poletti; Anna Arnaudo; Riccardo Coppola
2026
Abstract
Mobile applications evolve rapidly, forcing development teams to deliver updates at high frequency while maintaining reliable testing processes. In this context, GUI testing is essential for validating user-facing behavior; however, it is highly affected by fragility, as tests break across application versions not due to functional defects, but rather because of changes in the GUI structure, appearance, or properties. This leads to substantial manual effort to diagnose failures and repair outdated tests. This study aims to: (i) identify the most common causes of mobile GUI test breakages across application versions; (ii) assess how Large Language Models (LLMs) can reduce the effort required for repairing broken tests; and (iii) compare an LLM-based repair strategy with a state-of-the-art automated repair tool, Healenium-Appium. A total of 61 broken GUI tests from 19 real-world Android applications were analyzed to identify the underlying causes of breakage. We then developed an LLM-based repair approach, in which the model and user iteratively review test failures and progressively update test scripts until the repaired test passes. The approach was experimentally evaluated against Healenium-Appium. The LLM-based method successfully repaired 45 out of 61 tests (73.8%) after a single interaction and 56 out of 61 tests (91.8%) after multiple interactions, outperforming Healenium-Appium by 50.8% and 68.8%, respectively. These results show that LLM-based repair effectively mitigates GUI test fragility, reducing maintenance effort while achieving higher repair success rates than existing automated tools.

| File | Access | Type | License | Size | Format |
|---|---|---|---|---|---|
| SANER_2026_paper_504 (1).pdf | Open access | 2. Post-print / Author's Accepted Manuscript | Public - All rights reserved | 331.81 kB | Adobe PDF |
| I_Will_Try_to_Fix_You_Large_Language_Models_for_Mobile_GUI_Test_Repair.pdf | Restricted access | 2a. Post-print publisher's version / Version of Record | Non-public - Private/restricted access | 412.24 kB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/3009409
