Recently, Google proposed the Topics API framework as a privacy-friendly alternative for behavioural advertising, intended to balance users' privacy and advertisement effectiveness. Using the Topics API, the browser builds a user profile based on navigation history, which advertisers can access. The Topics API aims to become the new standard for behavioural advertising, so it is necessary to fully understand its operation and find possible limitations. In this article, we evaluate the robustness of the Topics API to a re-identification attack. To build a user profile, we suppose an attacker accumulates over time the topics a user exposes to different websites. The attacker later re-identifies the same user by matching the profiles of their audience. We leverage real traffic traces and realistic population models, and we present increasingly powerful attacks. We find that the Topics API mitigates but cannot prevent re-identification, as there is a sizeable chance that a user's profile remains unique within a website's audience and the attacker successfully matches it with the profile of the same user on a second website. Depending on environmental factors, the probability of correct re-identification can reach 50%, considering a pool of 1,000 users. We offer the code and data we use in this work to stimulate further studies and the tuning of the Topics API parameters.
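The matching step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each profile is simply the set of topic IDs a user has exposed to a site over time, it matches by exact set equality, and it ignores details such as random topics. The function name `reidentify`, the user identifiers, and the topic IDs are hypothetical.

```python
from collections import Counter


def reidentify(profiles_a, profiles_b):
    """Match users across two sites by exact equality of accumulated topic sets.

    profiles_a, profiles_b: dict mapping a site-local user id to a set of topic ids.
    Returns {id_on_a: id_on_b} for profiles that are unique on both sites.
    """
    # Freeze the sets so they can be counted and used as dictionary keys.
    frozen_a = {uid: frozenset(t) for uid, t in profiles_a.items()}
    frozen_b = {uid: frozenset(t) for uid, t in profiles_b.items()}

    count_a = Counter(frozen_a.values())
    count_b = Counter(frozen_b.values())

    # Index site B by profile, keeping only profiles held by exactly one user.
    unique_b = {p: uid for uid, p in frozen_b.items() if count_b[p] == 1}

    matches = {}
    for uid_a, profile in frozen_a.items():
        # A profile is only useful to the attacker if it is unique on site A too.
        if count_a[profile] == 1 and profile in unique_b:
            matches[uid_a] = unique_b[profile]
    return matches


# Hypothetical audiences: topic ids are arbitrary integers.
site_a = {"a1": {3, 17, 42}, "a2": {5, 8}, "a3": {5, 8}}
site_b = {"b1": {5, 8}, "b2": {3, 17, 42}, "b3": {9}}
matches = reidentify(site_a, site_b)
print(matches)
```

In this toy audience only the user with a unique profile on both sites is linked; the two users sharing the same topic set on the first site hide each other, which is the protective effect the abstract says is only partial.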
Re-Identification Attacks against the Topics API / Jha, Nikhil; Trevisan, Martino; Leonardi, Emilio; Mellia, Marco. - In: ACM TRANSACTIONS ON THE WEB. - ISSN 1559-1131. - ELECTRONIC. - 18:3(2024), pp. 1-24. [10.1145/3675400]
Re-Identification Attacks against the Topics API
Jha, Nikhil; Trevisan, Martino; Leonardi, Emilio; Mellia, Marco
2024
File | Size | Format |
---|---|---|
3675400.pdf (open access; Type: 2a Post-print publisher's version / Version of Record; License: Public - All rights reserved) | 1.5 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/2994443