TCSOM: Clustering Transactions Using Self-Organizing Map

Neural Processing Letters

Abstract

Self-Organizing Map (SOM) networks have been successfully applied as a clustering method for numeric datasets. However, it is not feasible to apply SOM directly to transactional data. This paper proposes the Transactions Clustering using SOM (TCSOM) algorithm for clustering binary transactional data. TCSOM measures the distance between an input vector and an output neuron with a dissimilarity measure based on the normalized dot product, and it adjusts the weights of the winner and its neighbors with a modified weight adaptation function. More importantly, TCSOM is a one-pass algorithm, which makes it well suited to data mining applications. Experimental results on real datasets show that TCSOM achieves higher clustering accuracy than state-of-the-art transactional data clustering algorithms.
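
The abstract names the ingredients of the method (a normalized dot-product dissimilarity, a winner-and-neighbors weight update, and a single pass over the data) but does not give the formulas. The Python sketch below is only an illustration under stated assumptions, not the authors' algorithm: it uses one minus cosine similarity as a stand-in for the dissimilarity measure, the standard Kohonen update in place of the paper's modified adaptation function, and a one-dimensional map. The names `dissimilarity`, `one_pass_som`, `lr`, and `seed` are hypothetical.

```python
import numpy as np


def dissimilarity(x, w):
    """Assumed measure: 1 - cosine similarity (a normalized dot product)."""
    denom = np.linalg.norm(x) * np.linalg.norm(w)
    if denom == 0.0:
        return 1.0  # treat an all-zero vector as maximally dissimilar
    return 1.0 - float(x @ w) / denom


def one_pass_som(data, n_neurons, lr=0.5, seed=0):
    """Assign each binary transaction to a neuron in a single pass.

    Assumption: a 1-D map whose neighborhood is the winner's two adjacent
    neurons, updated with the standard Kohonen rule w += rate * (x - w).
    The paper's modified adaptation function is not reproduced here.
    """
    rng = np.random.default_rng(seed)
    weights = rng.random((n_neurons, data.shape[1]))
    labels = np.empty(len(data), dtype=int)
    for i, x in enumerate(data):
        # The winner is the neuron with the smallest dissimilarity to x.
        winner = int(np.argmin([dissimilarity(x, w) for w in weights]))
        labels[i] = winner
        # Move the winner (full rate) and its neighbors (half rate) toward x.
        for j in (winner - 1, winner, winner + 1):
            if 0 <= j < n_neurons:
                rate = lr if j == winner else lr / 2
                weights[j] += rate * (x - weights[j])
    return labels, weights


# Four transactions over four items, encoded as binary vectors.
transactions = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
])
labels, weights = one_pass_som(transactions, n_neurons=2)
```

Because each transaction is read once and labeled immediately, the loop never revisits earlier data, which is the property that makes a one-pass scheme attractive for large datasets. The resulting cluster assignment depends on the random initial weights and on the order of the transactions.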

Author information

Corresponding author

Correspondence to Zengyou He.

About this article

Cite this article

He, Z., Xu, X. & Deng, S. TCSOM: Clustering Transactions Using Self-Organizing Map. Neural Process Lett 22, 249–262 (2005). https://doi.org/10.1007/s11063-005-8016-3

  • DOI: https://doi.org/10.1007/s11063-005-8016-3