Transaction Clustering Using a Seeds Based Approach

Koh, Yun Sing; Pears, Russel

doi:10.1007/978-3-540-68125-0_93

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5012)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining

Abstract

Transaction clustering has received a great deal of attention in the past few years. It goes well beyond traditional clustering algorithms, which essentially perform a near-neighbourhood search to locate groups of similar instances. Transaction clustering builds on the notion of large items as defined by association rule mining algorithms. Clusters formed on the basis of large items shared between instances offer an attractive alternative to association rule mining systems. However, none of the techniques proposed so far offers a good solution to scenarios where large items overlap across clusters. In this paper we overcome this limitation by using cluster seeds that represent initial centroids. Seeds are generated from sets of transaction items that occur together above a certain threshold, and such seeds may overlap in their itemsets across clusters.
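
The seed idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details are not given here. It assumes that seeds are item pairs co-occurring in at least `min_support` transactions, and that each transaction joins the seed it most resembles under Jaccard similarity. The function names `generate_seeds`, `jaccard` and `assign_to_seeds`, the restriction to pairs, and the choice of similarity measure are all hypothetical choices made for illustration.

```python
from collections import Counter
from itertools import combinations


def generate_seeds(transactions, min_support):
    """Return item pairs that co-occur in at least `min_support` transactions.

    Assumption: seeds are restricted to pairs to keep the sketch short.
    """
    pair_counts = Counter()
    for items in transactions:
        # Deduplicate and sort so each pair has one canonical form.
        for pair in combinations(sorted(set(items)), 2):
            pair_counts[pair] += 1
    # Sort the pairs so the seed order is deterministic.
    return [frozenset(p) for p, c in sorted(pair_counts.items()) if c >= min_support]


def jaccard(a, b):
    """Jaccard similarity of two sets (assumed measure, not from the abstract)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def assign_to_seeds(transactions, seeds):
    """Assign each transaction to its most similar seed.

    Transactions sharing no item with any seed are returned separately.
    """
    clusters = {seed: [] for seed in seeds}
    unassigned = []
    for items in transactions:
        t = frozenset(items)
        best = max(seeds, key=lambda s: jaccard(t, s), default=None)
        if best is None or jaccard(t, best) == 0.0:
            unassigned.append(items)
        else:
            clusters[best].append(items)
    return clusters, unassigned
```

With transactions such as `["bread", "milk"]`, `["bread", "milk", "eggs"]`, `["bread", "beer"]` and `["bread", "beer", "chips"]` and `min_support=2`, the two resulting seeds both contain `bread`, which shows how seed itemsets may overlap across clusters while the transactions are still separated.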

Author information

Authors and Affiliations

School of Computing and Mathematical Sciences, Auckland University of Technology, New Zealand
Yun Sing Koh & Russel Pears

Editor information

Takashi Washio Einoshin Suzuki Kai Ming Ting Akihiro Inokuchi

About this paper

Cite this paper

Koh, Y.S., Pears, R. (2008). Transaction Clustering Using a Seeds Based Approach. In: Washio, T., Suzuki, E., Ting, K.M., Inokuchi, A. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2008. Lecture Notes in Computer Science, vol 5012. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68125-0_93

DOI: https://doi.org/10.1007/978-3-540-68125-0_93
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68124-3
Online ISBN: 978-3-540-68125-0
eBook Packages: Computer Science