Practical anonymization for data streams: z-anonymity and relation with k-anonymity / Jha, Nikhil; Vassio, Luca; Trevisan, Martino; Leonardi, Emilio; Mellia, Marco. - In: PERFORMANCE EVALUATION. - ISSN 0166-5316. - PRINT. - 159:(2023), p. 102329. [10.1016/j.peva.2022.102329]
Practical anonymization for data streams: z-anonymity and relation with k-anonymity
Nikhil Jha; Luca Vassio; Martino Trevisan; Emilio Leonardi; Marco Mellia
2023
Abstract
With the advent of big data and the emergence of data markets, preserving individuals’ privacy has become of utmost importance. The classical response to this need is anonymization, i.e., sanitizing the information that, directly or indirectly, can allow users’ re-identification. Among the various approaches, k-anonymity provides a simple and easy-to-understand protection. However, k-anonymity is challenging to achieve in a continuous stream of data and scales poorly when the number of attributes becomes high. In this paper, we study a novel anonymization property called z-anonymity that we explicitly design to deal with data streams, i.e., where the decision to publish a given attribute (atomic information) is made in real time. The idea at the base of z-anonymity is to release such an attribute about a user only if at least z − 1 other users have exposed the same attribute in a past time window. Depending on the value of z, the output stream is k-anonymized with a certain probability. To this end, we present a probabilistic model to map the z-anonymity property into the k-anonymity property. The model is not only helpful in studying the z-anonymity property, but also general enough to evaluate the probability of achieving k-anonymity in data streams, resulting in a generic contribution.

File | Type | License | Size | Format | Access
---|---|---|---|---|---
1-s2.0-S0166531622000372-main.pdf | 2a Post-print, publisher's version / Version of Record | Non-public, private/restricted access | 2.22 MB | Adobe PDF | Not available
z_anonymity__Zero_Delay_Anonymization_for_Data_Streams__EXTENDED_.pdf | 2. Post-print / Author's Accepted Manuscript | Creative Commons | 740.44 kB | Adobe PDF | Open Access from 31/12/2023
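
The release rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `z_anonymize`, the `(timestamp, user, attribute)` tuple layout, the parameter `delta_t` for the window length, and the choice to count each user once per attribute are all assumptions made for illustration. The threshold is taken to be z - 1 other users within the past window.

```python
from collections import defaultdict


def z_anonymize(stream, z, delta_t):
    """Yield only the observations that pass a z-anonymity style check.

    stream  -- iterable of (timestamp, user, attribute), sorted by timestamp
    z       -- release threshold (assumed: z - 1 *other* users are required)
    delta_t -- length of the past time window, same unit as the timestamps
    """
    # attribute -> {user: timestamp of that user's latest exposure}
    last_seen = defaultdict(dict)
    for t, user, attr in stream:
        seen = last_seen[attr]
        # Forget users whose latest exposure fell out of the window.
        for u in [u for u, ts in seen.items() if ts < t - delta_t]:
            del seen[u]
        seen[user] = t
        # Distinct other users who exposed attr within the window.
        others = len(seen) - 1
        if others >= z - 1:
            yield (t, user, attr)


# Hypothetical toy stream: three users expose "x", one exposes "y".
events = [(0, "a", "x"), (1, "b", "x"), (2, "c", "x"), (3, "a", "y"), (20, "a", "x")]
released = list(z_anonymize(events, z=3, delta_t=10))
print(released)  # [(2, 'c', 'x')]
```

With `z=3`, only the third exposure of `"x"` is released, because it is the first one preceded by two other users inside the window; the exposure at time 20 is suppressed again because the earlier ones have expired. The decision for each observation is made as it arrives, which matches the real-time setting the abstract describes.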
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/2974412