
Speeding up the Self-Organizing Feature Map Using Dynamic Subset Selection

Neural Processing Letters

Abstract

An active learning algorithm is devised for training Self-Organizing Feature Maps on large data sets. Active learning algorithms recognize that not all exemplars are created equal. Thus, the concepts of exemplar age and difficulty are used to filter the original data set such that training epochs are only conducted over a small subset of the original data set. The ensuing Hierarchical Dynamic Subset Selection algorithm introduces definitions of exemplar difficulty suited to an unsupervised learning context and, in turn, appropriate Self-Organizing Map (SOM) stopping criteria. The algorithm is benchmarked on several real-world data sets with training set exemplar counts in the region of 30–500 thousand. Cluster accuracy is demonstrated to be at least as good as that from the original SOM algorithm while requiring a fraction of the computational overhead.
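The idea of filtering exemplars by age and difficulty can be illustrated with a minimal sketch. It is not the authors' Hierarchical Dynamic Subset Selection algorithm: the abstract does not give the definitions, so everything below is an assumption made for illustration. Difficulty is taken to be the squared quantization error of an exemplar against its best-matching unit, age is the number of epochs since the exemplar was last selected, the two are mixed by a hypothetical `difficulty_share` parameter, the map is one-dimensional, and there is no hierarchy and no stopping criterion. All function names are hypothetical.

```python
import math
import random

def sqdist(x, w):
    """Squared Euclidean distance between an exemplar and a unit."""
    return sum((xi - wi) ** 2 for xi, wi in zip(x, w))

def quantization_error(x, weights):
    """Assumed difficulty measure: distance to the best-matching unit."""
    return min(sqdist(x, w) for w in weights)

def select_subset(difficulty, age, size, rng, difficulty_share=0.7):
    """Draw `size` distinct indices, favouring difficult or old exemplars."""
    n = len(difficulty)
    total_d = sum(difficulty) or 1.0
    total_a = sum(age) or 1.0
    # Mix the normalised scores; the tiny constant keeps every exemplar selectable.
    score = [difficulty_share * d / total_d
             + (1.0 - difficulty_share) * a / total_a + 1e-12
             for d, a in zip(difficulty, age)]
    # Weighted sampling without replacement: keep the largest log(u) / score keys.
    keys = [math.log(1.0 - rng.random()) / s for s in score]
    ranked = sorted(range(n), key=lambda i: keys[i], reverse=True)
    return sorted(ranked[:min(size, n)])

def train_som_dss(data, n_units, epochs, subset_size, lr=0.5, seed=0):
    """Train a 1-D SOM, visiting only a selected subset in each epoch."""
    rng = random.Random(seed)
    weights = [list(rng.choice(data)) for _ in range(n_units)]
    difficulty = [1.0] * len(data)
    age = [1.0] * len(data)
    for epoch in range(epochs):
        subset = set(select_subset(difficulty, age, subset_size, rng))
        rate = lr * (1.0 - epoch / epochs)  # linearly decaying learning rate
        for i in subset:
            x = data[i]
            bmu = min(range(n_units), key=lambda j: sqdist(x, weights[j]))
            for j in range(n_units):
                # Full step for the BMU, half step for its immediate neighbours.
                h = 1.0 if j == bmu else 0.5 if abs(j - bmu) == 1 else 0.0
                if h:
                    weights[j] = [wi + rate * h * (xi - wi)
                                  for xi, wi in zip(x, weights[j])]
        for i in range(len(data)):
            if i in subset:
                # Refresh difficulty only for visited exemplars and reset their age.
                difficulty[i] = quantization_error(data[i], weights)
                age[i] = 0.0
            else:
                age[i] += 1.0
    return weights
```

In this sketch each epoch costs time proportional to `subset_size` rather than to the full data set for the weight updates, which is the source of the saving the abstract describes; the age term ensures that exemplars the map already fits well are still revisited occasionally.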



Author information

Corresponding author

Correspondence to Malcolm I. Heywood.

About this article

Cite this article

Wetmore, L., Heywood, M.I. & Zincir-Heywood, A.N. Speeding up the Self-Organizing Feature Map Using Dynamic Subset Selection. Neural Process Lett 22, 17–32 (2005). https://doi.org/10.1007/s11063-004-7775-6

  • DOI: https://doi.org/10.1007/s11063-004-7775-6
