Adapting Self-Organizing Map Algorithm to Sparse Data

Melka, Josué; Mariage, Jean-Jacques

doi:10.1007/978-3-030-16469-0_8

Part of the book series: Studies in Computational Intelligence (SCI, volume 829)

Included in the following conference series:

International Joint Conference on Computational Intelligence


Abstract

Machine learning techniques applied to data mining face demanding time and memory requirements, and should therefore take full advantage of the increased power of recent multi-core processors. When they are applied to sparse data, it is also sometimes necessary to find an appropriate reformulation of the algorithms, keeping in mind that memory load was and still is an issue. In [1], we presented a mathematical reformulation of the standard and the batch versions of the Self-Organizing Map algorithm for sparse data, proposed a parallel implementation of the batch version, and carried out initial performance evaluation tests. Here we reproduce and extend our experiments on a more powerful hardware architecture and compare the results with our previous ones. A thorough quantitative and qualitative analysis confirms our earlier results.


Notes

1. Using the squared distance here is equivalent to using the Euclidean distance, and avoids the square root computation.
2. The complexity of Eq. (2) does not depend on the vector size; it is only \(\mathcal{O}(TM)\).
3. We used version 1.7.4. We have since improved Somoclu, and later versions use a more efficient sparse computation (see: https://github.com/peterwittek/somoclu/commit/d5ffcf250db77aa103a9de96968ef0e27dc14d15).
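
Note 1 can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not necessarily the chapter's exact formulation: a sparse input is assumed to be stored as a dict of its non-zero components, the squared norm of each codebook vector is assumed to be precomputed, and the names `squared_distance_sparse`, `bmu_sparse` and `bmu_euclidean` are hypothetical. Expanding ‖x − w‖² = ‖x‖² − 2 x·w + ‖w‖² means that only the non-zero components of x are visited, and because the square root is monotonic, the best matching unit (BMU) is the same as with the Euclidean distance.

```python
import math


def squared_distance_sparse(x, w, w_sq_norm):
    """Squared distance between a sparse input and a dense codebook vector.

    x         : dict {index: value} holding only the non-zero components (assumed layout)
    w         : dense list of floats
    w_sq_norm : precomputed sum of squares of w
    """
    x_sq_norm = sum(v * v for v in x.values())
    dot = sum(v * w[k] for k, v in x.items())  # touches non-zero entries only
    return x_sq_norm - 2.0 * dot + w_sq_norm


def bmu_sparse(x, codebook, sq_norms):
    """Index of the unit minimising the squared distance (no square root)."""
    return min(
        range(len(codebook)),
        key=lambda j: squared_distance_sparse(x, codebook[j], sq_norms[j]),
    )


def bmu_euclidean(x_dense, codebook):
    """Reference search using the true Euclidean distance on dense vectors."""
    return min(range(len(codebook)), key=lambda j: math.dist(x_dense, codebook[j]))


# Toy data, invented for illustration only.
codebook = [
    [0.0, 1.0, 0.0, 2.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 3.0, 1.0],
]
sq_norms = [sum(wk * wk for wk in w) for w in codebook]

x_sparse = {0: 1.0, 3: 0.5}      # same vector as x_dense, non-zero entries only
x_dense = [1.0, 0.0, 0.0, 0.5]

# Both searches select unit 1.
print(bmu_sparse(x_sparse, codebook, sq_norms), bmu_euclidean(x_dense, codebook))
```

In this sketch the cost of one distance evaluation depends on the number of non-zero components of the input rather than on the full vector size, provided the squared norms of the codebook vectors are kept up to date.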

References

Melka, J., Mariage, J.: Efficient implementation of self-organizing map for sparse input data. In: Proceedings of the 9th International Joint Conference on Computational Intelligence, IJCCI 2017, pp. 54–63, Funchal, Madeira, Portugal (2017)
Ultsch, A.: Data mining and knowledge discovery with emergent self-organizing feature maps for multivariate time series. Kohonen Maps 46, 33–46 (1999)
Kohonen, T.: Self-organized formation of topologically correct feature maps. Biol. Cybern. 43, 59–69 (1982)
Honkela, T., Kaski, S., Lagus, K., Kohonen, T.: Newsgroup exploration with WEBSOM method and browsing interface. Technical Report A32, Helsinki University of Technology (1996)
Kaski, S., Honkela, T., Lagus, K., Kohonen, T.: WEBSOM-self-organizing maps of document collections. Neurocomputing 21, 101–117 (1998)
Ultsch, A., Mörchen, F.: ESOM-Maps: tools for clustering, visualization, and classification with Emergent SOM. Technical Report 46, Department of Mathematics and Computer Science, University of Marburg, Germany (2005)
Polzlbauer, G., Dittenbach, M., Rauber, A.: A visualization technique for self-organizing maps with vector fields to obtain the cluster structure at desired levels of detail. In: Proceedings of the 2005 IEEE International Joint Conference on Neural Networks, 2005, IJCNN’05, vol. 3, pp. 1558–1563. IEEE (2005)
Vesanto, J., Ahola, J.: Hunting for correlations in data using the self-organizing map. In: Proceeding of the International ICSC Congress on Computational Intelligence Methods and Applications (CIMA’99), pp. 279–285. ICSC Academic Press (1999)
Carpenter, G.A., Grossberg, S.: ART 2: self-organization of stable category recognition codes for analog input patterns. Appl. Opt. 26, 4919–4930 (1987)
He, J., Tan, A.-H., Tan, C.-L.: Modified ART 2A growing network capable of generating a fixed number of nodes. IEEE Trans. Neural Netw. 15, 728–737 (2004)
Carpenter, G.A., Grossberg, S., Rosen, D.B.: ART 2-A: an adaptive resonance algorithm for rapid category learning and recognition. Neural Netw. 4, 493–504 (1991)
Wittek, P., Gao, S.C., Lim, I.S., Zhao, L.: Somoclu: an efficient parallel library for self-organizing maps. J. Stat. Softw. 78, 1–21 (2017)
Liao, G., Chen, P., Du, L., Su, L., Liu, Z., Tang, Z., Shi, T.: Using SOM neural network for X-ray inspection of missing-bump defects in three-dimensional integration. Microelectron. Reliab. 55, 2826–2832 (2015)
Kohonen, T.: Self-Organizing Maps. 2nd edn. Springer Series in Information Sciences, vol. 30. Springer, Berlin (1997)
Kohonen, T.: Things you haven’t heard about the self-organizing map. In: 1993 IEEE International Conference on Neural Networks, pp. 1147–1156 (1993)
Mulier, F., Cherkassky, V.: Self-organization as an iterative Kernel smoothing process. Neural Comput. 7, 1165–1177 (1995)
Cheng, Y.: Convergence and ordering of Kohonen’s batch map. Neural Comput. 9, 1667–1676 (1997)
Ienne, P., Thiran, P., Vassilas, N.: Modified self-organizing feature map algorithms for efficient digital hardware implementation. IEEE Trans. Neural Netw. 8, 315–330 (1997)
Lawrence, R.D., Almasi, G.S., Rushmeier, H.E.: A scalable parallel algorithm for self-organizing maps with applications to sparse data mining problems. Data Min. Knowl. Discov. 3, 171–195 (1999)
Maiorana, F.: Performance improvements of a Kohonen self organizing classification algorithm on sparse data sets. In: Proceedings of the 10th WSEAS International Conference on Mathematical Methods, Computational Techniques and Intelligent Systems, MAMECTIS’08, pp. 347–352. World Scientific and Engineering Academy and Society (WSEAS) (2008)
Natarajan, R.: Exploratory data analysis in large, sparse datasets. Technical Report, IBM Thomas J. Watson Research Division (1997)
Roussinov, D.G., Chen, H.: A scalable self-organizing map algorithm for textual classification: a neural network approach to thesaurus generation. Commun. Cogn. Artif. Intell. J. (1998)
Kohonen, T.: Essentials of the self-organizing map. Neural Netw. 37, 52–65 (2013)
Olteanu, M., Villa-Vialaneix, N.: Sparse online self-organizing maps for large relational data. In: Advances in Self-Organizing Maps and Learning Vector Quantization (Proceedings of WSOM 2016). Advances in Intelligent Systems and Computing, vol. 428, pp. 27–37. Springer, Houston, Texas, USA (2016)
Wu, C.H., Hodges, R.E., Wang, C.J.: Parallelizing the self-organizing feature map on multiprocessor systems. Parallel Comput. 17, 821–832 (1991)
Seiffert, U., Michaelis, B.: Multi-dimensional self-organizing maps on massively parallel hardware. In: Advances in Self-Organising Maps, pp. 160–166. Springer, Berlin (2001)
Guan, H., Li, C.K., Cheung, T.Y., Yu, S.: Parallel design and implementation of SOM neural computing model in PVM environment of a distributed system. In: Proceedings of the Advances in Parallel and Distributed Computing, pp. 26–31. IEEE (1997)
Bandeira, N., Lobo, V., Moura-Pires, F.: Training a Self-Organizing Map distributed on a PVM network. In: 1998 IEEE International Joint Conference on Neural Networks Proceedings. IEEE World Congress on Computational Intelligence, vol. 1, pp. 457–461 (1998)
Tomsich, P., Rauber, A., Merkl, D.: Optimizing the parSOM neural network implementation for data mining with distributed memory systems and cluster computing. In: Proceedings 11th International Workshop on Database and Expert Systems Applications, pp. 661–665. IEEE (2000)
Labonté, G., Quintin, M.: Network parallel computing for SOM neural networks. In: High Performance Computing Systems and Applications, pp. 575–586. Springer, Berlin (2002)
Hämäläinen, T.D.: Parallel implementations of self-organizing maps. In: Seiffert, U., Jain, L.C. (eds.) Self-Organizing Neural Networks, pp. 245–278. Springer, New York (2002)
Campbell, A., Berglund, E., Streit, A.: Graphics hardware implementation of the parameter-less self-organising map. In: International Conference on Intelligent Data Engineering and Automated Learning, pp. 343–350. Springer, Berlin (2005)
Moraes, F.C., Botelho, S.C., Duarte Filho, N., Gaya, J.F.O.: Parallel high dimensional self organizing maps using CUDA. In: Brazilian Robotics Symposium and Latin American Robotics Symposium (SBR-LARS), pp. 302–306. IEEE (2012)
Richardson, T., Winer, E.: Extending parallelization of the self-organizing map by combining data and network partitioned methods. Adv. Eng. Softw. 88, 1–7 (2015)
Daneshpajouh, H., Delisle, P., Boisson, J.C., Krajecki, M., Zakaria, N.: Parallel batch self-organizing map on graphics processing unit using CUDA. In: Latin American High Performance Computing Conference, pp. 87–100. Springer, Berlin (2017)
Wittek, P., Darányi, S.: Accelerating text mining workloads in a MapReduce-based distributed GPU environment. J. Parallel Distrib. Comput. 73, 198–206 (2013)
Kohonen, T., Kaski, S., Lagus, K., Salojarvi, J., Honkela, J., Paatero, V., Saarela, A.: Self organization of a massive document collection. IEEE Trans. Neural Netw. 11, 574–585 (2000)
Lagus, K., Kaski, S., Kohonen, T.: Mining massive document collections by the WEBSOM method. Inf. Sci. 163, 135–156 (2004)
Takatsuka, M., Bui, M.: Parallel batch training of the self-organizing map using openCL. In: Neural Information Processing: Models and Applications, pp. 470–476. Springer, Berlin (2010)
Nordström, T.: Designing parallel computers for self organizing maps. In: Proceedings of the 4th Swedish Workshop on Computer System Architecture (DSA-92), pp. 13–15 (1992)
Dagum, L., Menon, R.: OpenMP: an industry standard API for shared-memory programming. IEEE Comput. Sci. Eng. 5, 46–55 (1998)
Yang, M.H., Ahuja, N.: A data partition method for parallel self-organizing map. In: International Joint Conference on Neural Networks, IJCNN’99, vol. 3, pp. 1929–1933. IEEE (1999)
Silva, B., Marques, N.: A hybrid parallel SOM algorithm for large maps in data-mining. New Trends in Artificial Intelligence (2007)
Chang, C.C., Lin, C.J.: LIBSVM data: classification (Multi Class). https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multiclass.html (2006)
Ma, J., Saul, L.K., Savage, S., Voelker, G.M.: Identifying suspicious urls: an application of large-scale online learning. In: Proceedings of the 26th annual international conference on machine learning, pp. 681–688. ACM (2009)
Stamper, J., Niculescu-Mizil, A., Ritter, S., Gordon, G., Koedinger, K.: Bridge to Algebra 2008–2009, Challenge data set from KDD Cup 2010 Educational Data Mining Challenge (2010)
Juan, Y., Zhuang, Y., Chin, W.S., Lin, C.J.: Field-aware factorization machines for CTR prediction. In: Proceedings of the 10th ACM Conference on Recommender Systems, pp. 43–50. ACM (2016)
Lewis, D.D., Yang, Y., Rose, T.G., Li, F.: RCV1: a new benchmark collection for text categorization research. J. Mach. Learn. Res. 5, 361–397 (2004)
Lang, K.: Newsweeder: Learning to filter netnews. In: Proceedings of the 12th International Conference on Machine Learning, pp. 331–339 (1995)
McCallum, A., Nigam, K.: A Comparison of event models for Naive Bayes text classification. In: AAAI/ICML-98 Workshop on Learning for Text Categorization. Technical Report WS-98-05, pp. 41–48 (1998)
LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86, 2278–2324 (1998)
Hull, J.J.: A database for handwritten text recognition research. IEEE Trans. Pattern Anal. Mach. Intell. 16, 550–554 (1994)
Wang, J.Y.: Application of support vector machines in bioinformatics. Ph.D. Thesis, National Taiwan University (2002)
Noordewier, M.O., Towell, G.G., Shavlik, J.W.: Training knowledge-based neural networks to recognize genes in DNA sequences. Adv. Neural Inf. Process. Syst. 3, 530–536 (1991)
King, R.D., Feng, C., Sutherland, A.: StatLog: comparison of classification algorithms on large real-world problems. Appl. Artif. Intell. Int. J. 9, 289–333 (1995)
Frey, P.W., Slate, D.J.: Letter recognition using Holland-style adaptive classifiers. Mach. Learn. 6, 161–182 (1991)
Fort, J.C., Letremy, P., Cottrell, M.: Advantages and drawbacks of the Batch Kohonen algorithm. ESANN 2, 223–230 (2002)
Nöcker, M., Mörchen, F., Ultsch, A.: An algorithm for fast and reliable ESOM learning. In: ESANN, 14th European Symposium on Artificial Neural Networks, pp. 131–136 (2006)
Kohonen, T., Hynninen, J., Kangas, J., Laaksonen, J.: SOM_PAK: The self-organizing map program package. Report A31, Helsinki University of Technology, Laboratory of Computer and Information Science (1996)
Kiviluoto, K.: Topology preservation in self-organizing maps. In: IEEE International Conference on Neural Networks, vol. 1, pp. 294–299 (1996)

Author information

Authors and Affiliations

Laboratoire d’Informatique Avancée de Saint-Denis, Université Paris 8, 2 Rue de la Liberté, Saint-Denis, France
Josué Melka & Jean-Jacques Mariage

Corresponding author

Correspondence to Josué Melka.

Editor information

Editors and Affiliations

IUT Sénart, Université Paris-Est Créteil (UPEC), Créteil, France
Christophe Sabourin
Department of Computer Architecture and Technology, University of Granada, Granada, Spain
Juan Julian Merelo
Université Paris-Est Créteil (UPEC), Créteil, France
Kurosh Madani
University of Reading, Reading, UK
Kevin Warwick

Appendix

1.1 Perf Analysis of Serial Runs

Note: sparse-bsom-v2 is a variant of the Sparse-BSom algorithm in which the BMU search uses an outer loop over the data and an inner loop over the codebook.
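
The two loop orders mentioned in the note can be sketched as follows. This is only an illustration under assumptions: it supposes that the baseline search iterates over the codebook in the outer loop (which the note implies but does not state), it stores sparse inputs as dicts of non-zero components, and all function names are hypothetical. Both orders return the same best matching units; they differ only in the order in which data and codebook vectors are traversed.

```python
import math


def sq_dist(x, w):
    """Squared distance between sparse x ({index: value}) and dense w."""
    w_sq = sum(wk * wk for wk in w)
    # ||x - w||^2 = ||w||^2 + sum over non-zero k of (x_k^2 - 2 * x_k * w_k)
    return w_sq + sum(v * v - 2.0 * v * w[k] for k, v in x.items())


def bmus_codebook_outer(data, codebook):
    """Assumed baseline order: outer loop on codebook, inner loop on data."""
    best_d = [math.inf] * len(data)
    best_j = [-1] * len(data)
    for j, w in enumerate(codebook):
        for i, x in enumerate(data):
            d = sq_dist(x, w)
            if d < best_d[i]:
                best_d[i], best_j[i] = d, j
    return best_j


def bmus_data_outer(data, codebook):
    """Order described for sparse-bsom-v2: outer loop on data, inner on codebook."""
    result = []
    for x in data:
        best_d, best_j = math.inf, -1
        for j, w in enumerate(codebook):
            d = sq_dist(x, w)
            if d < best_d:
                best_d, best_j = d, j
        result.append(best_j)
    return result


# Toy data, invented for illustration only.
codebook = [
    [0.0, 1.0, 0.0, 2.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 3.0, 1.0],
]
data = [{0: 1.0, 3: 0.5}, {2: 2.0}]

# Both loop orders agree on the BMU of every input.
assert bmus_codebook_outer(data, codebook) == bmus_data_outer(data, codebook)
```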

About this paper

Cite this paper

Melka, J., Mariage, JJ. (2019). Adapting Self-Organizing Map Algorithm to Sparse Data. In: Sabourin, C., Merelo, J.J., Madani, K., Warwick, K. (eds) Computational Intelligence. IJCCI 2017. Studies in Computational Intelligence, vol 829. Springer, Cham. https://doi.org/10.1007/978-3-030-16469-0_8


DOI: https://doi.org/10.1007/978-3-030-16469-0_8
Published: 30 May 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-16468-3
Online ISBN: 978-3-030-16469-0
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)
