QATCH: Automatic Evaluation of SQL-Centric Tasks on Proprietary Data

Papicchio, Simone; Papotti, Paolo; Cagliero, Luca

doi:10.1145/3712704

Tabular Representation Learning (TRL) and Large Language Models (LLMs) have become established for tackling Question Answering (QA) and Semantic Parsing (SP) tasks on tabular data. State-of-the-art models are pre-trained and evaluated on large open-domain datasets. However, the performance on existing QA and SP benchmarks is not necessarily representative of that achieved on proprietary data as the characteristics of the input and the complexity of the posed queries show high variability. To tackle this challenge, our goal is to allow end-users to evaluate TRL and LLM performance on their own proprietary data. We present Query-Aided TRL CHecklist (QATCH), a toolbox to automatically generate a testing checklist tailored to QA and SP. QATCH provides a testing suite highlighting models' strengths and weaknesses on relational tables unseen at training time. The proposed toolbox relies on a SQL query generator that crafts tests of varying types and complexity including, amongst others, tests on null values, projection, selections, joins, group by, and having clauses. QATCH also supports a set of general cross-task performance metrics providing more insights into SQL-related model capabilities than currently used metrics. The empirical results, achieved by state-of-the-art TRL models and LLMs, show substantial performance differences (1) between existing benchmarks and proprietary data, (2) across queries of different complexity.

QATCH: Automatic Evaluation of SQL-Centric Tasks on Proprietary Data / Papicchio, Simone; Papotti, Paolo; Cagliero, Luca. - In: ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY. - ISSN 2157-6904. - 16:2(2025), pp. 1-26. [10.1145/3712704]

Year of publication: 2025
DOI: https://dx.doi.org/10.1145/3712704
Journal: ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY
Document type: 1.1 Journal article

Files in this record:

File	Access	Type	License	Size	Format
3712704.pdf	Open access	2a Post-print, publisher's version / Version of Record	Creative Commons	3.02 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/3002591

PORTO @ Archivio Istituzionale della Ricerca
