Privacy-preserved federated clustering with Non-IID data via GANs

The Journal of Supercomputing

Abstract

Federated clustering (FedC) clusters participants using global similarity measures and then trains on the independent clusters to improve global accuracy. As an unsupervised federated learning approach, FedC operates on distributed, unlabeled data while preserving privacy. It faces two challenges, however: non-independent and identically distributed (Non-IID) client data makes the global clustering structure fragile, and shared gradients can leak private information. This study introduces GFC-DP, a privacy-preserving federated clustering algorithm for Non-IID data that uses generative adversarial networks (GANs) to address both data heterogeneity and privacy protection. The algorithm uses GANs to generate synthetic data, leveraging global information to construct robust clustering structures. Notably, as the first work to introduce a client selection strategy into GAN training, it improves the global GAN model by defining a client evaluation equation and then selecting the better-performing clients to participate in training. Gaussian noise is added during GAN training to strengthen privacy and counter model inversion and membership inference attacks. One-shot FedC is then performed on the client side, based on global centroids, to obtain a stable global clustering structure. We conducted comprehensive experiments on the MNIST, CIFAR-10, Rotated MNIST, and Rotated CIFAR-10 datasets. The results show that, in Non-IID scenarios, GFC-DP achieves higher accuracy than similar algorithms in both GAN performance and clustering effectiveness on image classification tasks.
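Two of the steps named in the abstract, adding Gaussian noise during training and clustering on the client side against global centroids, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no details, so the sketch assumes the common clip-then-add-noise form of the Gaussian mechanism and a plain nearest-centroid assignment. The function names and the parameters `clip_norm` and `noise_multiplier` are hypothetical.

```python
import math
import random


def clip_and_add_gaussian_noise(grad, clip_norm, noise_multiplier, rng):
    """Clip a gradient vector to L2 norm `clip_norm`, then add Gaussian noise.

    Assumption: noise standard deviation = noise_multiplier * clip_norm.
    Both parameters are illustrative, not values taken from the paper.
    """
    norm = math.sqrt(sum(g * g for g in grad))
    # Scale down only if the gradient exceeds the clipping bound.
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    sigma = noise_multiplier * clip_norm
    return [g * scale + rng.gauss(0.0, sigma) for g in grad]


def assign_to_global_centroids(points, centroids):
    """Return, for each local point, the index of its nearest global centroid."""

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
        for p in points
    ]
```

A client would call `clip_and_add_gaussian_noise(grad, clip_norm=1.0, noise_multiplier=1.1, rng=random.Random())` on each update before sharing it, and `assign_to_global_centroids(local_points, global_centroids)` once the server has distributed the centroids. A larger `noise_multiplier` gives stronger privacy at the cost of noisier updates.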

Data Availability

All data generated or analyzed during this study are available at tensorflow/g3doc/tutorials/mnist/ and http://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz.

Funding

This work was supported by the National Natural Science Foundation of China under Grant No. 62102074 and the Natural Science Foundation of Liaoning Province No. 2024-MSBA-49.

Author information

Contributions

Jianzhe Zhao was involved in conceptualization and methodology. Wenji Wang was involved in data curation, software, and writing (original draft preparation). Jiabao Wang was involved in visualization, software, and investigation. Songyang Zhang was involved in software and validation. Linzhe Fan was involved in software and writing (reviewing and editing). Stan Matwin was involved in conceptualization and writing (reviewing).

Corresponding author

Correspondence to Wenji Wang.

Ethics declarations

Conflict of interest

The authors have no relevant financial or non-financial interests to disclose.

Ethical approval

Not applicable, since the study does not involve humans or animals.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhao, J., Wang, W., Wang, J. et al. Privacy-preserved federated clustering with Non-IID data via GANs. J Supercomput 81, 512 (2025). https://doi.org/10.1007/s11227-025-07006-2

  • DOI: https://doi.org/10.1007/s11227-025-07006-2