Privacy regulation laws, such as GDPR, impose transparency and security as design pillars for data processing algorithms. In this context, federated learning is one of the most influential frameworks for privacy-preserving distributed machine learning, achieving astounding results in many natural language processing and computer vision tasks. Several federated learning frameworks employ differential privacy to prevent private data leakage to unauthorized parties and malicious attackers. Many studies, however, highlight the vulnerabilities of standard federated learning to poisoning and inference, thus raising concerns about potential risks for sensitive data. To address this issue, we present SGDE, a generative data exchange protocol that improves user security and machine learning performance in a cross-silo federation. The core of SGDE is to share data generators with strong differential privacy guarantees trained on private data instead of communicating explicit gradient information. These generators synthesize an arbitrarily large amount of data that retain the distinctive features of private samples but differ substantially. In this work, SGDE is tested in a cross-silo federated network on images and tabular datasets, exploiting beta-variational autoencoders as data generators. From the results, the inclusion of SGDE turns out to improve task accuracy and fairness, as well as resilience to the most influential attacks on federated learning.
SGDE: Secure Generative Data Exchange for Cross-Silo Federated Learning / Lomurno, Eugenio; Archetti, Alberto; Cazzella, Lorenzo; Samele, Stefano; Di Perna, Leonardo; Matteucci, Matteo. - ELETTRONICO. - 0:(2022), pp. 205-214. (Intervento presentato al convegno AIPR 2022, 5th International Conference on Artificial Intelligence and Pattern Recognition tenutosi a Xiamen (CHN) nel September 23-25, 2022) [10.1145/3573942.3573974].
SGDE: Secure Generative Data Exchange for Cross-Silo Federated Learning
Archetti, Alberto;Matteucci, Matteo
2022
Abstract
Privacy regulation laws, such as GDPR, impose transparency and security as design pillars for data processing algorithms. In this context, federated learning is one of the most influential frameworks for privacy-preserving distributed machine learning, achieving astounding results in many natural language processing and computer vision tasks. Several federated learning frameworks employ differential privacy to prevent private data leakage to unauthorized parties and malicious attackers. Many studies, however, highlight the vulnerabilities of standard federated learning to poisoning and inference, thus raising concerns about potential risks for sensitive data. To address this issue, we present SGDE, a generative data exchange protocol that improves user security and machine learning performance in a cross-silo federation. The core of SGDE is to share data generators with strong differential privacy guarantees trained on private data instead of communicating explicit gradient information. These generators synthesize an arbitrarily large amount of data that retain the distinctive features of private samples but differ substantially. In this work, SGDE is tested in a cross-silo federated network on images and tabular datasets, exploiting beta-variational autoencoders as data generators. From the results, the inclusion of SGDE turns out to improve task accuracy and fairness, as well as resilience to the most influential attacks on federated learning.File | Dimensione | Formato | |
---|---|---|---|
SGDE_preprint.pdf
accesso aperto
Tipologia:
2. Post-print / Author's Accepted Manuscript
Licenza:
Pubblico - Tutti i diritti riservati
Dimensione
562.3 kB
Formato
Adobe PDF
|
562.3 kB | Adobe PDF | Visualizza/Apri |
3573942.3573974.pdf
accesso riservato
Tipologia:
2a Post-print versione editoriale / Version of Record
Licenza:
Non Pubblico - Accesso privato/ristretto
Dimensione
1.03 MB
Formato
Adobe PDF
|
1.03 MB | Adobe PDF | Visualizza/Apri Richiedi una copia |
Pubblicazioni consigliate
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.
https://hdl.handle.net/11583/2971578