Network monitoring is a key step for several applications, such as cyber-security and traffic engineering. Examples of monitoring data include packet traces captured in the network and log files obtained from services like the DNS and BGP. It is widely known that monitoring may expose privacy-sensitive information. Deep packet inspection, for example, exposes the destination servers contacted by users, as well as non-encrypted fields of certain protocols, such as the Server Name Indication (SNI) in TLS handshakes. New privacy regulations (e.g., GDPR) impose strict rules on handling data that carry privacy-sensitive information. They guarantee the protection of personal data, grant the interested parties certain rights, and assign regulators the powers to enforce them. As network monitoring data carry information that reveals users' identity, they must be treated in the light of these regulations. Network monitoring infrastructure must guarantee that sensitive information is not leaked or, preferably, must not collect any unnecessary data that may threaten users' privacy. Historically, the solution to these problems has been anonymization, i.e., replacing sensitive fields with obfuscated copies. This approach, however, has two drawbacks. First, anonymization reduces the value of the collected information. For instance, while anonymizing client and server IP addresses in traffic logs helps to protect privacy, it makes it impossible to evaluate particular services that could be identified by their server IP addresses. Second, anonymizing protocol fields in isolation is not sufficient, as users' identity might be revealed by subtler techniques. For example, even if one obfuscates the client IP addresses in DNS traffic logs, the set of hostnames resolved by a client (if exposed in the logs) may still help to uncover identities.
We are building a flexible tool that exposes to monitors only the information strictly required, thus reducing risks to people's privacy at the source. Our solution satisfies three requirements: (i) it automatically searches for protocol fields that can be linked to particular users; (ii) it anonymizes information considering the entire protocol stack, and uses a stateful approach, employing k-anonymization algorithms; (iii) it is lightweight and scalable, thus deployable on high-speed links at multiple Gb/s. Our solution is based on the Intel Data Plane Development Kit, a set of libraries and drivers for fast packet processing. We have built a prototype that is deployed in a campus network. At present, the prototype is able to handle multiple 10 Gb/s links with zero packet loss while performing several anonymization steps on packets. Anonymized packets are forwarded to legacy monitoring systems, which receive information already stripped of privacy-sensitive fields. We are testing k-anonymization approaches to perform selective anonymization of sensitive fields, such as TLS SNIs and server IP addresses, with the aim of obfuscating only those cases in which the information helps to uncover the users behind the traffic. In this poster we will present our architecture and system design, and show preliminary results of the prototype deployment.
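
To make the idea of stateful, selective anonymization concrete, the following is a minimal sketch in Python (not the DPDK-based prototype) of one way a k-anonymity check on a field such as the TLS SNI could work. The class name `KAnonymizer`, the parameters `k` and `window`, and the keyed-hash fallback are assumptions made for illustration; the abstract does not specify the algorithm actually used.

```python
import hashlib
import hmac
from collections import defaultdict


class KAnonymizer:
    """Release a field value only if at least k distinct clients used it recently.

    Hypothetical illustration: names and parameters are not from the poster.
    """

    def __init__(self, k: int, window: float, key: bytes) -> None:
        self.k = k            # minimum number of distinct clients
        self.window = window  # seconds a client observation stays valid
        self.key = key        # secret key for the obfuscation fallback
        # value -> {client: timestamp of the last observation}
        self._seen = defaultdict(dict)

    def _obfuscate(self, value: str) -> str:
        # Keyed hash: the same value always maps to the same token,
        # but the token cannot be reversed without the key.
        digest = hmac.new(self.key, value.encode(), hashlib.sha256)
        return digest.hexdigest()[:16]

    def process(self, client: str, value: str, now: float) -> str:
        clients = self._seen[value]
        clients[client] = now
        # Drop observations that fell out of the time window.
        for stale in [c for c, ts in clients.items() if now - ts > self.window]:
            del clients[stale]
        # Popular values are shared by many users and are exposed as-is;
        # rare values could single out a user and are obfuscated.
        if len(clients) >= self.k:
            return value
        return self._obfuscate(value)
```

With `k=2` and a 60-second window, a hostname requested by a single client is replaced by an opaque token, and it is exposed in clear only once a second client requests it within the window. Note that in this sketch records emitted before the threshold is reached remain obfuscated; handling them differently (e.g., by buffering) is a design choice left open here.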

Privacy-preserving network monitoring at high-speed / Favale, Thomas; Mellia, Marco; Drago, Idilio; Trevisan, Martino. - PRINT. - (2019). (Paper presented at the ACM IMC 2019 conference, held in Amsterdam (NL), 21/10/2019 - 23/10/2019).

Privacy-preserving network monitoring at high-speed

Thomas Favale; Marco Mellia; Idilio Drago; Martino Trevisan
2019

Files in this item:
Poster IMC.pdf
Access: open access
Type: Other attached material
License: PUBLIC - All rights reserved
Size: 634.07 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2766737