The Sweet Danger of Sugar: Debunking Representation Learning for Encrypted Traffic Classification

Zhao, Yuqi; Dettori, Giovanni; Boffa, Matteo; Vassio, Luca; Mellia, Marco

doi:10.1145/3718958.3750498

Recently, we have witnessed an explosion of proposals that, inspired by Language Models like BERT, exploit Representation Learning models to create traffic representations. All of them promise astonishing performance in encrypted traffic classification (up to 98% accuracy). In this paper, with a networking expert mindset, we critically reassess their performance. Through extensive analysis, we demonstrate that the reported successes are heavily influenced by data preparation problems, which allow these models to find easy shortcuts (spurious correlations between features and labels) during fine-tuning that unrealistically boost their performance. When such shortcuts are not present, as in real scenarios, these models perform poorly. We also introduce Pcap-Encoder, an LM-based representation learning model that we specifically design to extract features from protocol headers. Pcap-Encoder appears to be the only model that provides an instrumental representation for traffic classification. Yet, its complexity calls into question its applicability in practical settings. Our findings reveal flaws in dataset preparation and model training, calling for a better and more conscious test design. We propose a correct evaluation methodology and stress the need for rigorous benchmarking.

The Sweet Danger of Sugar: Debunking Representation Learning for Encrypted Traffic Classification / Zhao, Yuqi; Dettori, Giovanni; Boffa, Matteo; Vassio, Luca; Mellia, Marco. - ELECTRONIC. - (2025), pp. 296-310. (Paper presented at SIGCOMM '25: ACM SIGCOMM 2025 Conference, held in Coimbra (PT), September 8-11, 2025) [10.1145/3718958.3750498].

Year of publication: 2025
ISBN: 979-8-4007-1524-2
Publication type: 4.1 Contribution in conference proceedings

Files in this record:

File	Size	Format
3718958.3750498.pdf (open access; Description: Final version; Type: 2a Post-print, publisher's version / Version of Record; License: Creative Commons)	1.07 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/3003539

PORTO @ Archivio Istituzionale della Ricerca
