z-anonymity: Zero-Delay Anonymization for Data Streams / Jha, Nikhil; Favale, Thomas; Vassio, Luca; Trevisan, Martino; Mellia, Marco. - Electronic. - (2021), pp. 3996-4005. (Paper presented at the 2020 IEEE International Conference on Big Data (Big Data), held in Atlanta, GA, USA, on 10-13 Dec. 2020) [10.1109/BigData50022.2020.9378422].

z-anonymity: Zero-Delay Anonymization for Data Streams

Jha, Nikhil; Favale, Thomas; Vassio, Luca; Trevisan, Martino; Mellia, Marco
2021

Abstract

With the advent of big data and the birth of data markets that sell personal information, individuals’ privacy is of utmost importance. The classical response is anonymization, i.e., sanitizing the information that can directly or indirectly allow users’ re-identification. The most popular solution in the literature is k-anonymity. However, k-anonymity is hard to achieve on a continuous stream of data, as well as when the number of dimensions becomes high.

In this paper, we propose a novel anonymization property called z-anonymity. Unlike k-anonymity, it can be achieved with zero delay on data streams and is well suited to high-dimensional data. The idea underlying z-anonymity is to release an attribute (an atomic piece of information) about a user only if at least z − 1 other users have presented the same attribute in a past time window. z-anonymity is weaker than k-anonymity, since it does not consider combinations of attributes but treats each attribute individually. We present a probabilistic framework to map z-anonymity into the k-anonymity property. Our results show that, with a proper choice of the z-anonymity parameters, the data curator is likely to obtain a k-anonymized dataset, with a precisely measurable probability. We also evaluate a real use case, in which we consider the website visits of a population of users, and show that z-anonymity can work in practice for obtaining k-anonymity too.
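
The release rule stated in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `ZAnonymizer`, the method `observe`, the `window` parameter, and the choice of a sliding window `(t - window, t]` over non-decreasing timestamps are all assumptions made here for illustration. The sketch counts distinct users per attribute and releases an observation only when at least z users (the current one plus z − 1 others) have shown that attribute within the window.

```python
from collections import OrderedDict, defaultdict


class ZAnonymizer:
    """Illustrative zero-delay filter (hypothetical names, not the paper's code).

    An observation (user, attribute) is released only if at least z distinct
    users, the current one included, presented the same attribute within the
    last `window` time units. Timestamps are assumed to be non-decreasing.
    """

    def __init__(self, z: int, window: float):
        if z < 1:
            raise ValueError("z must be at least 1")
        self.z = z
        self.window = window
        # attribute -> OrderedDict mapping user -> time of latest observation,
        # ordered from least to most recently seen user.
        self._recent = defaultdict(OrderedDict)

    def observe(self, t: float, user: str, attribute: str) -> bool:
        """Record one observation; return True if it may be released now."""
        recent = self._recent[attribute]
        # Move the user to the most-recent end with the new timestamp.
        recent.pop(user, None)
        recent[user] = t
        # Evict users whose latest observation fell out of (t - window, t].
        cutoff = t - self.window
        while recent:
            oldest_user, oldest_t = next(iter(recent.items()))
            if oldest_t > cutoff:
                break
            del recent[oldest_user]
        # Release only if z distinct users are currently in the window.
        return len(recent) >= self.z


if __name__ == "__main__":
    anonymizer = ZAnonymizer(z=3, window=10.0)
    stream = [
        (0.0, "u1", "a.com"),
        (1.0, "u2", "a.com"),
        (2.0, "u3", "a.com"),
        (3.0, "u1", "b.com"),
        (20.0, "u1", "a.com"),
    ]
    for t, user, attribute in stream:
        decision = "release" if anonymizer.observe(t, user, attribute) else "suppress"
        print(t, user, attribute, decision)
```

In this sketch the decision is taken at the moment each observation arrives, which is one way to read the zero-delay property. Each attribute is tracked independently, so combinations of attributes are never examined, in line with the abstract's remark that z-anonymity is weaker than k-anonymity.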
ISBN: 978-1-7281-6251-5
ISBN: 978-1-7281-6252-2
Files in this item:

09378422.pdf

not available

Type: 2a Post-print, publisher's version / Version of Record
License: Not public - Private/restricted access
Size: 1.06 MB
Format: Adobe PDF

z_anonymity__Zero_Delay_Anonymization_for_Data_Streams (2).pdf

open access

Type: 2. Post-print / Author's Accepted Manuscript
License: PUBLIC - All rights reserved
Size: 496.76 kB
Format: Adobe PDF
Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2878858