Skip to main content
Log in

Parallelization of the self-organized maps algorithm for federated learning on distributed sources

  • Published:
The Journal of Supercomputing Aims and scope Submit manuscript

Abstract

This paper describes a formally based approach for parallelizing the Kohonen algorithm used for the federated learning process in a special kind of neural networks—Self-Organizing Maps. Our approach enables executing the parallel algorithm version on the distributed data sources, taking into account the kind of data distribution on the nodes. Compared to the traditional approaches, we distinguish two kinds of data distributions—horizontal and vertical: for both, our suggested approach avoids gathering data in a single storage, but rather moves computations nearer to the data source nodes. This reduces the execution time of the algorithm, the network traffic, and the risk of an unauthorized access to the data during their transmission. Our experimental evaluation demonstrates the advantages of the approach.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6

Similar content being viewed by others

References

  1. Dehghani Z (2019) How to move beyond a monolithic data lake to a distributed data mesh. https://martinfowler.com/articles/data-monolith-to-mesh.html

  2. Voigt P, Von dem Bussche A (2017) The EU general data protection regulation (GDPR). In: A practical guide, 1st ed. Springer International Publishing, Cham

  3. California Consumer Privacy Act Home Page. https://www.caprivacy.org/

  4. Konecný J, Brendan McMahan H, Ramage D, Richtárik P (2016) Federated optimization: distributed machine learning for on-device intelligence. arXiv:CoRRabs/1610.02527(2016)

  5. Yang Q, Liu Y, Chen T, Tong Y (2019) Federated machine learning: concept and applications. ACM Trans Intell Syst Technol 10(2):12

    Article  Google Scholar 

  6. Kohonen T (2001) Self-organizing maps (Third Extended Edition), New York

  7. Kholod I, Shorov A, Efimova M, Gorlatch S (2019) Parallelization of algorithms for mining data from distributed sources. PaCT-2019. Springer. LNCS, pp 289–303 https://doi.org/10.1007/978-3-030-25636-4_23

  8. Hastie T, Tibshirani R, Friedman J (2001) The elements of statistical learning: data mining, inference, and prediction. Springer

  9. Dean J, Ghemawat S (2004) MapReduce: simplified data processing on large clusters. In Proceedings of Operating Systems Design and Implementation. San Francisco, CA

  10. Gorlatch S, Cole M (2011) Parallel Skeletons. In: Padua D (ed.) Encyclopedia of parallel computing. Springer

  11. Lawrence RD, Almasi GS, Rushmeier HE (1999) A scalable parallel algorithm for selfor-ganizing maps with applications to sparse data mining problems. Data Min Knowl Disc 3(2):171–195

    Article  Google Scholar 

  12. Fort J, Letrémy P, Cottrell M (2002) Advantages and drawbacks of the Batch Kohonen algo-rithm. ESANN

  13. Weichel Ch (2010) Adapting self-organizing maps to the mapreduce programming paradigm. STeP, pp 119–131. https://doi.org/10.1524/9783486853162.119

  14. Sarazin T, Azzag H, Lebbah M (2014) SOM Clustering using spark-mapreduce. In: 2014 IEEE 28th International Parallel & Distributed Processing Symposium Workshops, pp 1727–1734 https://doi.org/10.1109/IPDPSW.2014.192

  15. Dafonte C, Garabato D, Álvarez MA, Manteiga M (2018) Distributed fast self-organized maps for massive spectrophotometric data analysis. Sensors (Basel) 18(5):1419. Published 2018 May 3. https://doi.org/10.3390/s18051419

  16. Flavius LG, Jose Alfredo FC (2008) Parallel self-organizing maps with application in clustering distributed data. Neural Networks. IJCNN 2008. IEEE International Joint Conference on IEEE World Congress on Computational Intelligence

  17. Li Q, et al (2020) Federated learning systems: vision, hype and reality for data privacy and protection. arXiv:abs/1907.09693

  18. Ingerman A, Ostrowski K (2019) Introducing TensorFlow Federated https://blog.tensorflow.org/2019/03/introducing-tensorflow-federated.html

  19. Ryffel Th, Trask A, Dahl M, Wagner B, Mancuso J, Rueckert D, Passerat-Palmbach J (2018) A generic framework for privacy preserving deep learning. preprint arXiv:1811.04017

  20. An Industrial Grade Federated Learning Framework https://fate.fedai.org/

  21. Paddle Federated Learning https://github.com/PaddlePaddle/PaddleFL

  22. Kholod I, Kuprianov M, Titkov E, Shorov A, Postnikova E, Mironenko I, Sokolov S (2019) Training normal Bayes classifier on distributed data. Proc Comput Sci 150:389–396. https://doi.org/10.1016/j.procs.2019.02.068

    Article  Google Scholar 

  23. Kholod I, Rukavitsyn A, Reva N, Shorov A (2019) Distributed data clustering by neural network algorithms. In: Proceedings of the 2019 IEEE Russia Section Young Researchers in Electrical and Electronic Engineering Conference—IEEE. pp 249–253. https://doi.org/10.1109/EIConRus.2019.8657175

  24. https://github.com/Awethon/SOM-FuncBlock

  25. https://github.com/iiholod/XelopesFL

Download references

Acknowledgements

We are grateful to the anonymous reviewers whose very helpful comments allowed us to significantly improve. This work was supported by the German Ministry of Education and Research (BMBF) in the framework of project HPC2SE at the University of Muenster.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Ivan Kholod.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Kholod, I., Rukavitsyn, A., Paznikov, A. et al. Parallelization of the self-organized maps algorithm for federated learning on distributed sources. J Supercomput 77, 6197–6213 (2021). https://doi.org/10.1007/s11227-020-03509-2

Download citation

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11227-020-03509-2

Keywords

Navigation