An Analysis of the Test Repair Capability of LLMs in Android GUI Testing

Fedriga, Alessandro; Fulcini, Tommaso; Coppola, Riccardo; Amalfitano, Domenico; Distante, Damiano; Ricca, Filippo

Automated GUI testing is a crucial activity in modern Android development, yet GUI tests are notoriously fragile, especially when the GUI is built with dynamic frameworks like Jetpack Compose. Minor UI changes frequently break tests, flooding continuous integration pipelines with false positives and burdening developers with costly repairs. To reduce test repair effort, we evaluate a developer-in-the-loop approach that uses a Large Language Model, GitHub Copilot with Claude 3.7 Sonnet, as a zero-shot repair agent within Android Studio. By analyzing the IDE context, this method updates broken selectors, adjusts test oracles, and preserves test semantics after GUI changes. We empirically evaluate the approach on the Bitwarden mobile app for Android, an open-source project containing 1083 GUI tests. We ran the test suite across two recent application versions, recorded the failures, and used the LLM to repair the failing tests. Our evaluation investigates test fragility, the effectiveness of zero-shot LLM-based repairs, the additional benefit of retrying prompts, and the improvement gained from brief developer interactions. Results show that a single zero-shot prompt recovers a significant proportion of failing tests, reducing manual maintenance effort. A retry provided minimal additional benefit, whereas brief developer interactions considerably improved recovery rates. Our findings indicate that integrating LLM-driven techniques substantially eases the maintenance burden of GUI test suites and improves their robustness against rapid UI evolution in Jetpack Compose applications.

An Analysis of the Test Repair Capability of LLMs in Android GUI Testing / Fedriga, Alessandro; Fulcini, Tommaso; Coppola, Riccardo; Amalfitano, Domenico; Distante, Damiano; Ricca, Filippo. - Electronic. - (In press). (18th International Conference on the Quality of Information and Communications Technology, Lisbon (POR), 3-5 September 2025).

Publication type: 4.1 Contribution in conference proceedings

File in this record:

Quatic_LLM_test_repair (3).pdf. Type: 2. Post-print / Author's Accepted Manuscript. License: Non-public, private/restricted access. Size: 502.74 kB. Format: Adobe PDF.


Use this identifier to cite or link to this document: https://hdl.handle.net/11583/3002251
