DarkVec: automatic analysis of darknet traffic with word embeddings

Gioacchini, Luca; Vassio, Luca; Mellia, Marco; Drago, Idilio; Houidi, Zied Ben; Rossi, Dario

doi:10.1145/3485983.3494863

Darknets are passive probes listening to traffic reaching IP addresses that host no services. Traffic reaching them is unsolicited by nature and often induced by scanners, malicious senders and misconfigured hosts. Its peculiar nature makes it a valuable source of information to learn about malicious activities. However, the massive amount of packets and sources that reach darknets makes it hard to extract meaningful insights. In particular, multiple senders contact the darknet while performing similar and coordinated tasks, which are often commanded by common controllers (botnets, crawlers, etc.). How to automatically identify and group those senders that share similar behaviors remains an open problem. We here introduce DarkVec, a methodology to identify clusters of senders (i.e., IP addresses) engaged in similar activities on darknets. DarkVec leverages word embedding techniques (e.g., Word2Vec) to capture the co-occurrence patterns of sources hitting the darknets. We extensively test DarkVec and explore its design space in a case study using one month of darknet data. We show that with a proper definition of service, the generated embeddings can be easily used to (i) associate unknown senders' IP addresses to the correct known labels (more than 96% accuracy), and (ii) identify new attack and scan groups of previously unknown senders. We contribute DarkVec source code and datasets to the community also to stimulate the use of word embeddings to automatically learn patterns on generic traffic traces.
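The idea in the abstract (treat senders that hit the same service close together in time as "words" sharing a context, then label unknown senders from their nearest known neighbors) can be illustrated with a minimal sketch. This is not the DarkVec implementation: it swaps Word2Vec for raw co-occurrence counts with cosine similarity so that it runs on the Python standard library alone. The toy trace, the port-to-service mapping, the window size, and all function names are assumptions made for illustration.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical toy trace: (timestamp, sender IP, destination port).
# Addresses come from documentation ranges; none of this is real data.
PACKETS = [
    (1, "192.0.2.1", 23), (2, "192.0.2.2", 23), (3, "192.0.2.1", 23),
    (4, "192.0.2.3", 23), (5, "192.0.2.2", 23),
    (1, "198.51.100.1", 445), (2, "198.51.100.2", 445), (3, "198.51.100.1", 445),
    (4, "198.51.100.3", 445), (5, "198.51.100.2", 445),
]


def service_of(port):
    """Assumed port-to-service mapping; the paper's own definition may differ."""
    return {23: "telnet", 445: "smb"}.get(port, "other")


def build_corpus(packets):
    """Build one 'sentence' per service: sender IPs ordered by arrival time."""
    sentences = defaultdict(list)
    for _, ip, port in sorted(packets):
        sentences[service_of(port)].append(ip)
    return list(sentences.values())


def cooccurrence_vectors(corpus, window=2):
    """Count, for each sender, which senders appear within `window` positions."""
    vectors = defaultdict(Counter)
    for sentence in corpus:
        for i, ip in enumerate(sentence):
            lo, hi = max(0, i - window), min(len(sentence), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[ip][sentence[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def knn_label(ip, vectors, labels, k=1):
    """Assign the majority label among the k most similar labeled senders."""
    ranked = sorted(labels, key=lambda other: cosine(vectors[ip], vectors[other]),
                    reverse=True)
    votes = Counter(labels[other] for other in ranked[:k])
    return votes.most_common(1)[0][0]


vectors = cooccurrence_vectors(build_corpus(PACKETS))
known = {"192.0.2.1": "telnet-scanner", "198.51.100.1": "smb-scanner"}
print(knn_label("192.0.2.3", vectors, known))
```

In this sketch an unlabeled sender inherits the label of the known sender whose context it shares most. A learned embedding such as Word2Vec plays the role of `cooccurrence_vectors` in the approach the abstract describes, producing dense vectors instead of sparse counts.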

DarkVec: automatic analysis of darknet traffic with word embeddings / Gioacchini, Luca; Vassio, Luca; Mellia, Marco; Drago, Idilio; Houidi, Zied Ben; Rossi, Dario. - PRINT. - (2021), pp. 76-89. (Paper presented at CoNEXT '21: 17th International Conference on emerging Networking EXperiments and Technologies, held as a Virtual Event, Germany, December 7-10, 2021) [10.1145/3485983.3494863].

Year of publication: 2021
ISBN: 9781450390989
Publication type: 4.1 Contribution in conference proceedings

Files in this item:

File	Access	Description	Type	License	Size	Format
3485983.3494863.pdf	Not available	Published article	2a Post-print, publisher's version / Version of Record	Not public - private/restricted access	1.58 MB	Adobe PDF
darkvec_preprint.pdf	Open access	Post-print	2. Post-print / Author's Accepted Manuscript	Public - all rights reserved	1.61 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2944874

PORTO @ Archivio Istituzionale della Ricerca
