I Will Try to Fix You: Large Language Models for Mobile GUI Test Repair

Fulcini, Tommaso; Poletti, Alessandro; Arnaudo, Anna; Coppola, Riccardo

doi:10.1109/SANER-C67878.2026.00052

Mobile applications evolve rapidly, forcing development teams to deliver updates at high frequency while maintaining reliable testing processes. In this context, GUI testing is essential for validating user-facing behavior; however, it is highly prone to fragility: tests break across application versions not because of functional defects, but because of changes in the GUI structure, appearance, or properties. Diagnosing these failures and repairing outdated tests requires substantial manual effort. This study aims to: (i) identify the most common causes of mobile GUI test breakages across application versions; (ii) assess how Large Language Models (LLMs) can reduce the effort required to repair broken tests; and (iii) compare an LLM-based repair strategy with a state-of-the-art automated repair tool, Healenium-Appium. A total of 61 broken GUI tests from 19 real-world Android applications were analyzed to identify the underlying causes of breakage. We then developed an LLM-based repair approach in which the model and the user iteratively review test failures and progressively update test scripts until the repaired test passes. The approach was experimentally evaluated against Healenium-Appium. The LLM-based method successfully repaired 45 out of 61 tests (73.8%) after a single interaction and 56 out of 61 tests (91.8%) after multiple interactions, outperforming Healenium-Appium by 50.8 and 68.8 percentage points, respectively. These results show that LLM-based repair effectively mitigates GUI test fragility, reducing maintenance effort while achieving higher repair success rates than existing automated tools.
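The iterative repair process described in the abstract can be pictured with a minimal Python sketch. It is not the authors' implementation: `run_test`, `ask_model`, `max_interactions`, and `RepairOutcome` are hypothetical names introduced here for illustration, and the sketch leaves out the user's review step that the abstract says is part of each interaction. It only shows the control flow of "run, report the failure, revise, run again", plus the arithmetic behind the reported success rates.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class RepairOutcome:
    script: str        # last version of the test script
    passed: bool       # whether that version passed
    interactions: int  # number of model interactions used


def repair_test(
    script: str,
    run_test: Callable[[str], Optional[str]],
    ask_model: Callable[[str, str], str],
    max_interactions: int = 5,
) -> RepairOutcome:
    """Revise a broken test script until it passes or the budget runs out.

    Assumptions (hypothetical, not from the paper):
    - run_test(script) returns None on success, else a failure message.
    - ask_model(script, failure) returns a revised script.
    """
    failure = run_test(script)
    interactions = 0
    while failure is not None and interactions < max_interactions:
        script = ask_model(script, failure)  # one interaction with the model
        interactions += 1
        failure = run_test(script)           # re-run the revised test
    return RepairOutcome(script, failure is None, interactions)


def repair_rate(repaired: int, total: int) -> float:
    """Percentage of repaired tests, rounded to one decimal place."""
    return round(100 * repaired / total, 1)
```

With the counts given in the abstract, `repair_rate(45, 61)` yields 73.8 and `repair_rate(56, 61)` yields 91.8, matching the reported single-interaction and multi-interaction success rates.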

I Will Try to Fix You: Large Language Models for Mobile GUI Test Repair / Fulcini, T., Poletti, A., Arnaudo, A., Coppola, R. - (2026), pp. 341-348. (Workshop on Validation, Analysis and Evolution of Software Tests, Limassol (CY), 17-20 March 2026) [10.1109/SANER-C67878.2026.00052].

Publication year: 2026
ISBN: 979-8-3315-8589-1
Publication type: 4.1 Contribution in conference proceedings

Files in this record:

File	Access	Type	License	Size	Format
SANER_2026_paper_504 (1).pdf	Open access	2. Post-print / Author's Accepted Manuscript	Public - All rights reserved	331.81 kB	Adobe PDF
I_Will_Try_to_Fix_You_Large_Language_Models_for_Mobile_GUI_Test_Repair.pdf	Restricted access	2a. Post-print, publisher's version / Version of Record	Non-public - Private/restricted access	412.24 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/3009409

PORTO @ Archivio Istituzionale della Ricerca
