Large Datasets Visualization with Neural Network Using Clustered Training Data

Ivanikovas, Sergėjus; Dzemyda, Gintautas; Medvedev, Viktor

doi:10.1007/978-3-540-85713-6_11

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5207)

Included in the following conference series:

East European Conference on Advances in Databases and Information Systems

Abstract

This paper presents the visualization of large datasets with the SAMANN algorithm, using clustering methods to reduce the initial dataset before network training. The visualization of multidimensional data is highly important in data mining because recent applications produce large amounts of data that require specific means for knowledge discovery. One way to visualize a multidimensional dataset is to project it onto a plane. This paper analyzes the visualization of multidimensional data using a feed-forward neural network. We investigate an unsupervised backpropagation algorithm to train a multilayer feed-forward neural network (SAMANN) to perform Sammon's nonlinear projection. The SAMANN network offers the generalization ability of projecting new data. Previous investigations showed that it is possible to train SAMANN using only a part of the analyzed dataset without loss of accuracy. It is very important to select a proper subset of vectors for the neural network training. One way to construct a relevant training subset is to use clustering. This makes it possible to speed up the visualization of large datasets.
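The idea of building a training subset by clustering can be illustrated with a minimal Python sketch. It rests on assumptions the abstract does not confirm: the clustering method is taken to be plain k-means, and the subset is formed by keeping the data point nearest each centroid. The function names `kmeans`, `select_training_subset`, and `sammon_stress` are hypothetical. The SAMANN network itself is not implemented here; the sketch covers only the reduction step and the standard Sammon stress, which could be used to judge a projection.

```python
import numpy as np


def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns (centroids, labels)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every centroid.
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return centroids, labels


def select_training_subset(X, k, seed=0):
    """Assumed selection rule: indices of the points nearest each centroid."""
    X = np.asarray(X, dtype=float)
    centroids, _ = kmeans(X, k, seed=seed)
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    # argmin over points (axis 0) gives one representative per centroid;
    # np.unique drops duplicates and sorts the indices.
    return np.unique(d2.argmin(axis=0))


def sammon_stress(X, Y, eps=1e-12):
    """Sammon's stress between points X and their low-dimensional images Y."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    iu = np.triu_indices(len(X), k=1)  # each pair i < j once
    d_star = np.linalg.norm(X[:, None] - X[None, :], axis=2)[iu]
    d = np.linalg.norm(Y[:, None] - Y[None, :], axis=2)[iu]
    d_star = np.maximum(d_star, eps)  # guard against coincident points
    return ((d_star - d) ** 2 / d_star).sum() / d_star.sum()


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(500, 5))          # synthetic stand-in dataset
    idx = select_training_subset(X, k=20)  # reduced training subset
    X_train = X[idx]
    print(X_train.shape)
```

A network trained on `X_train` would then be applied to all of `X`, and `sammon_stress` evaluated on the full projection to check whether accuracy was preserved. Other selection rules (for example, sampling several points per cluster) fit the same skeleton.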

Author information

Authors and Affiliations

Institute of Mathematics and Informatics, Akademijos str. 4, LT-08663, Vilnius, Lithuania
Sergėjus Ivanikovas, Gintautas Dzemyda & Viktor Medvedev
Vilnius Pedagogical University, Studentų str. 39, LT-08106, Vilnius, Lithuania
Sergėjus Ivanikovas, Gintautas Dzemyda & Viktor Medvedev

Editor information

Paolo Atzeni, Albertas Caplinskas, Hannu Jaakkola

About this paper

Cite this paper

Ivanikovas, S., Dzemyda, G., Medvedev, V. (2008). Large Datasets Visualization with Neural Network Using Clustered Training Data. In: Atzeni, P., Caplinskas, A., Jaakkola, H. (eds) Advances in Databases and Information Systems. ADBIS 2008. Lecture Notes in Computer Science, vol 5207. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85713-6_11

DOI: https://doi.org/10.1007/978-3-540-85713-6_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85712-9
Online ISBN: 978-3-540-85713-6
eBook Packages: Computer Science, Computer Science (R0)
