Abstract
Vector quantization of large datasets can be carried out with an incremental modelling approach: the modelling task is made incremental by partitioning or sampling the data, and the resulting subsets are processed by an incremental learner. Growing Neural Gas is an incremental vector quantization algorithm that preserves the topology of the input space and matches the distribution of the data. Distribution matching, however, can lead to an overpopulation of prototypes in high-density regions of the data. To tackle this drawback, we modify the original Growing Neural Gas algorithm by adding three new parameters: one controls the distribution of the codebook, and the other two control the quantization error and the number of units in the network. The resulting learning algorithm efficiently quantizes large datasets containing both high- and low-density regions while avoiding prototype proliferation.
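The chapter itself is not reproduced here, but as a point of reference for the baseline the abstract builds on, the following is a minimal sketch of the standard Growing Neural Gas update loop (Fritzke, 1995), not the authors' modified variant. All names and default values (eps_b, eps_n, lam, age_max, alpha, d, max_units) are illustrative assumptions rather than values taken from the chapter.

import numpy as np


def gng(data, max_units=50, eps_b=0.05, eps_n=0.006, age_max=50,
        lam=100, alpha=0.5, d=0.995, n_epochs=5, seed=0):
    """Minimal sketch of standard Growing Neural Gas (not the chapter's
    modified variant). data is an (n, dim) array; returns the codebook
    (unit positions) and the edge set of the learned topology."""
    rng = np.random.default_rng(seed)
    # Start with two units placed on randomly chosen data points.
    units = data[rng.choice(len(data), size=2, replace=False)].astype(float)
    errors = np.zeros(2)
    edges = {}  # (i, j) with i < j -> edge age

    def neighbors(i):
        return [b if a == i else a for (a, b) in edges if i in (a, b)]

    step = 0
    for _ in range(n_epochs):
        for x in data[rng.permutation(len(data))]:
            step += 1
            # Find the nearest (s1) and second-nearest (s2) units.
            dists = np.sum((units - x) ** 2, axis=1)
            s1, s2 = np.argsort(dists)[:2]
            # Age the edges of s1 and accumulate its quantization error.
            for e in list(edges):
                if s1 in e:
                    edges[e] += 1
            errors[s1] += dists[s1]
            # Move s1 and its topological neighbours toward the input.
            units[s1] += eps_b * (x - units[s1])
            for j in neighbors(s1):
                units[j] += eps_n * (x - units[j])
            # Connect s1 and s2 with a fresh (age 0) edge.
            edges[tuple(sorted((int(s1), int(s2))))] = 0
            # Prune stale edges, then drop units left without any edge.
            edges = {e: a for e, a in edges.items() if a <= age_max}
            keep = sorted({i for e in edges for i in e})
            if len(keep) < len(units):
                remap = {old: new for new, old in enumerate(keep)}
                units, errors = units[keep], errors[keep]
                edges = {(remap[a], remap[b]): age
                         for (a, b), age in edges.items()}
            # Every lam inputs, insert a unit halfway between the unit
            # with the largest accumulated error and its worst neighbour.
            if step % lam == 0 and len(units) < max_units:
                q = int(np.argmax(errors))
                nbrs = neighbors(q)
                if nbrs:
                    f = max(nbrs, key=lambda j: errors[j])
                    units = np.vstack([units, 0.5 * (units[q] + units[f])])
                    errors[q] *= alpha
                    errors[f] *= alpha
                    errors = np.append(errors, errors[q])
                    r = len(units) - 1
                    edges.pop(tuple(sorted((q, f))), None)
                    edges[(min(q, r), max(q, r))] = 0
                    edges[(min(f, r), max(f, r))] = 0
            # Globally decay accumulated errors.
            errors *= d
    return units, edges

For example, calling gng(np.random.rand(10000, 2)) yields a codebook whose prototypes roughly follow the data density; this distribution-matching behaviour is exactly what can overpopulate dense regions, and it is what the chapter's additional parameters are designed to rein in.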
Cite this chapter
Satizábal, H.F., Pérez-Uribe, A., Tomassini, M. (2009). Avoiding Prototype Proliferation in Incremental Vector Quantization of Large Heterogeneous Datasets. In: Franco, L., Elizondo, D.A., Jerez, J.M. (eds) Constructive Neural Networks. Studies in Computational Intelligence, vol 258. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04512-7_13
DOI: https://doi.org/10.1007/978-3-642-04512-7_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04511-0
Online ISBN: 978-3-642-04512-7