I Will Try to Fix You: Large Language Models for Mobile GUI Test Repair / Fulcini, T.; Poletti, A.; Arnaudo, A.; Coppola, R. (2026), pp. 341-348. (Workshop on Validation, Analysis and Evolution of Software Tests, Limassol (CY), 17-20 March 2026) [10.1109/SANER-C67878.2026.00052].

I Will Try to Fix You: Large Language Models for Mobile GUI Test Repair

Tommaso Fulcini; Alessandro Poletti; Anna Arnaudo; Riccardo Coppola
2026

Abstract

Mobile applications evolve rapidly, forcing development teams to deliver updates at high frequency while maintaining reliable testing processes. In this context, GUI testing is essential for validating user-facing behavior; however, it is highly prone to fragility: tests break across application versions not because of functional defects, but because of changes in the GUI structure, appearance, or properties. Diagnosing these failures and repairing outdated tests requires substantial manual effort. This study aims to: (i) identify the most common causes of mobile GUI test breakages across application versions; (ii) assess how Large Language Models (LLMs) can reduce the effort required for repairing broken tests; and (iii) compare an LLM-based repair strategy with a state-of-the-art automated repair tool, Healenium-Appium. A total of 61 broken GUI tests from 19 real-world Android applications were analyzed to identify the underlying causes of breakage. We then developed an LLM-based repair approach, in which the model and user iteratively review test failures and progressively update test scripts until the repaired test passes. The approach was experimentally evaluated against Healenium-Appium. The LLM-based method successfully repaired 45 out of 61 tests (73.8%) after a single interaction and 56 out of 61 tests (91.8%) after multiple interactions, outperforming Healenium-Appium by 50.8 and 68.8 percentage points, respectively. These results show that LLM-based repair effectively mitigates GUI test fragility, reducing maintenance effort while achieving higher repair success rates than existing automated tools.
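
The iterative repair process summarized in the abstract can be pictured with a minimal Python sketch. This is not the authors' implementation: the names `repair_test`, `run_test`, `propose_fix`, and `max_interactions` are hypothetical, the LLM and the test runner are replaced by injected callables, and the stand-in functions at the bottom only simulate a renamed element identifier. The `success_rate` helper simply recomputes the percentages reported above from the raw counts.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class RepairResult:
    script: str        # final version of the test script
    passed: bool       # whether the final script passed
    interactions: int  # number of fix proposals that were requested


def repair_test(
    script: str,
    run_test: Callable[[str], Tuple[bool, str]],
    propose_fix: Callable[[str, str], str],
    max_interactions: int = 5,
) -> RepairResult:
    """Run the test, and on failure ask for an updated script, until it passes.

    `run_test` returns (passed, failure_log); `propose_fix` stands in for one
    model/user interaction. Both are assumptions made for this sketch.
    """
    for used in range(max_interactions + 1):
        passed, failure_log = run_test(script)
        if passed:
            return RepairResult(script, True, used)
        if used == max_interactions:
            break  # interaction budget exhausted, give up
        script = propose_fix(script, failure_log)
    return RepairResult(script, False, max_interactions)


# Hypothetical stand-ins: the "app" renamed id/login_btn to id/sign_in.
def fake_run_test(script: str) -> Tuple[bool, str]:
    if "id/sign_in" in script:
        return True, ""
    return False, "element not found: id/login_btn"


def fake_propose_fix(script: str, failure_log: str) -> str:
    return script.replace("id/login_btn", "id/sign_in")


def success_rate(repaired: int, total: int) -> float:
    """Percentage of repaired tests, rounded to one decimal place."""
    return round(100 * repaired / total, 1)


result = repair_test('tap("id/login_btn")', fake_run_test, fake_propose_fix)
print(result.passed, result.interactions)
print(success_rate(45, 61), success_rate(56, 61))
```

Passing the runner and the fix proposer as callables keeps the loop independent of any particular LLM client or test framework, so each can be swapped out without changing the control flow.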
ISBN: 979-8-3315-8589-1
Files in this record:

SANER_2026_paper_504 (1).pdf
Access: open access
Type: 2. Post-print / Author's Accepted Manuscript
License: Public - All rights reserved
Size: 331.81 kB
Format: Adobe PDF

I_Will_Try_to_Fix_You_Large_Language_Models_for_Mobile_GUI_Test_Repair.pdf
Access: restricted access
Type: 2a. Post-print publisher's version / Version of Record
License: Non-public - Private/restricted access
Size: 412.24 kB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/3009409