Towards a Playground to Democratize Experimentation and Benchmarking of AI Agents for Network Troubleshooting / Wang, Zhihao; Cornacchia, Alessandro; Galante, Franco; Centofanti, Carlo; Sacco, Alessio; Jiang, Dingde. - ELECTRONIC. - (2025), pp. 1-3. (Paper presented at the 1st Workshop on Next-Generation Network Observability (NGNO), Coimbra, Portugal, September 8-11, 2025) [10.1145/3748496.3748990].
Towards a Playground to Democratize Experimentation and Benchmarking of AI Agents for Network Troubleshooting
Wang, Zhihao; Galante, Franco; Sacco, Alessio
2025
Abstract
Artificial Intelligence (AI) and Large Language Models (LLMs) are increasingly finding application in network-related tasks, such as network configuration synthesis and dialogue-based interfaces to network measurements, among others. In this preliminary work, we restrict our focus to the application of AI agents to network troubleshooting and elaborate on the need for a standardized, reproducible, and open benchmarking platform on which to build and evaluate AI agents with low operational effort. This platform primarily aims to standardize and democratize experimentation with AI agents by enabling researchers and practitioners, including non-domain experts such as ML/AI engineers, to evaluate AI agents on curated problem sets without concern for the underlying operational complexities. We present a modular and extensible benchmarking framework that supports widely adopted network emulators. It targets an extensible set of network issues in diverse real-world scenarios (e.g., data centers, access networks, WANs) and orchestrates end-to-end evaluation workflows, including failure injection, telemetry instrumentation and collection, and agent performance evaluation. Agents can be easily connected to an emulation platform through a single Application Programming Interface (API) and rapidly evaluated. The code is publicly available at https://github.com/zhihao1998/LLM4NetLab.
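
As an illustration of the single-API connection model described in the abstract, the sketch below shows how an agent loop might interact with such a benchmarking platform: start a task (emulator setup and failure injection), pull telemetry, submit a diagnosis, and receive a score. All names (NetLabClient, start_task, get_telemetry, submit_diagnosis) are hypothetical placeholders and are not taken from the LLM4NetLab codebase; see the repository for the actual interface.

# Hypothetical Python sketch of an agent interacting with a troubleshooting benchmark.
# None of these names come from LLM4NetLab; they only illustrate the workflow
# (scenario setup, failure injection, telemetry collection, diagnosis, scoring).

from dataclasses import dataclass, field
from typing import Dict


@dataclass
class NetLabClient:
    """Stand-in for the platform's single agent-facing API."""
    scenario: str                                   # e.g., "datacenter-leaf-spine"
    telemetry: Dict[str, str] = field(default_factory=dict)

    def start_task(self) -> str:
        # The real platform would spin up the emulator and inject a failure here.
        self.telemetry = {"bgp_summary": "neighbor 10.0.0.2 state Idle"}
        return "task-001"

    def get_telemetry(self, task_id: str) -> Dict[str, str]:
        # Return collected counters, logs, and probe results for this task.
        return self.telemetry

    def submit_diagnosis(self, task_id: str, diagnosis: str) -> float:
        # The platform would compare the diagnosis against the injected failure.
        return 1.0 if "bgp" in diagnosis.lower() else 0.0


def run_agent(client: NetLabClient) -> float:
    task_id = client.start_task()
    telemetry = client.get_telemetry(task_id)
    # A real agent would prompt an LLM with the telemetry; here we hard-code a guess.
    diagnosis = "BGP session down between leaf and spine"
    return client.submit_diagnosis(task_id, diagnosis)


if __name__ == "__main__":
    score = run_agent(NetLabClient(scenario="datacenter-leaf-spine"))
    print(f"agent score: {score}")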
https://hdl.handle.net/11583/3004654
