Skip to main content

Transaction Clustering Using a Seeds Based Approach

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2008)

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 5012))

Included in the following conference series:

  • 1736 Accesses

Abstract

Transaction clustering has received a great deal of attention in the past few years. Its functionality extends well beyond traditional clustering algorithms which basically perform a near-neighbourhood search for locating groups of similar instances. The basic concept underlying transaction clustering stems from the concept of large items as defined by association rule mining algorithms. Clusters formed on the basis of large items that are shared between instances offer an attractive alternative to association rule mining systems. Currently, none of the techniques proposed offer a good solution to scenarios where large items overlap across clusters. In this paper we overcome the aforementioned limitations by using cluster seeds that represent initial centroids. Seeds are generated from sets of transaction items that occur together above a certain threshold and such seeds may overlap in their itemsets across clusters.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 129.00
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 169.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Jain, A.K., Murty, M.N., Flynn, P.J.: Data clustering: a review. ACM Comput. Surv. 31(3), 264–323 (1999)

    Article  Google Scholar 

  2. Ganti, V., Gehrke, J., Ramakrishnan, R.: CACTUS: Clustering categorical data using summaries. In: KDD 1999: Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 73–83. ACM Press, New York (1999)

    Chapter  Google Scholar 

  3. Gibson, D., Kleinberg, J.M., Raghavan, P.: Clustering categorical data: An approach based on dynamical systems. In: VLDB 1998: Proceedings of the 24rd International Conference on Very Large Data Bases, pp. 311–322. Morgan Kaufmann Publishers Inc., San Francisco (1998)

    Google Scholar 

  4. Guha, S., Rastogi, R., Shim, K.: ROCK: A robust clustering algorithm for categorical attributes. Information Systems 25(5), 345–366 (2000)

    Article  Google Scholar 

  5. Agrawal, R., Imielinski, T., Swami, A.: Mining association rules between sets of items in large databases. In: Buneman, P., Jajodia, S. (eds.) Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data, pp. 207–216 (1993)

    Google Scholar 

  6. Xu, J., Xiong, H., Sung, S.Y., Kumar, V.: A new clustering algorithm for transaction data via caucus. In: Whang, K.-Y., Jeon, J., Shim, K., Srivastava, J. (eds.) PAKDD 2003. LNCS (LNAI), vol. 2637, pp. 551–562. Springer, Heidelberg (2003)

    Chapter  Google Scholar 

  7. Wang, K., Xu, C., Liu, B.: Clustering transactions using large items. In: CIKM 1999: Proceedings of the Eighth International Conference on Information and Knowledge Management, pp. 483–490. ACM Press, New York (1999)

    Chapter  Google Scholar 

  8. Cutting, D.R., Pedersen, J.O., Karger, D., Tukey, J.W.: Scatter/gather: A cluster-based approach to browsing large document collections. In: Proceedings of the Fifteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 318–329 (1992)

    Google Scholar 

  9. Yun, C.H., Chuang, K.T., Chen, M.S.: An efficient clustering algorithm for market basket data based on small large ratios. In: COMPSAC 2001: Proceedings of the 25th International Computer Software and Applications Conference on Invigorating Software Development, pp. 505–510. IEEE Computer Society, Washington (2001)

    Google Scholar 

  10. Ivchenko, G.I., Honov, S.A.: On the jaccard similarity test. Journal of Mathematical Sciences 88(6)

    Google Scholar 

  11. Newman, D., Hettich, S., Blake, C., Merz, C.: UCI repository of machine learning databases (1998), http://www.ics.uci.edu/~mlearn/MLRepository.html

  12. Sharma, S.: Applied multivariate techniques. John Wiley & Sons Inc., Chichester (1996)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Takashi Washio Einoshin Suzuki Kai Ming Ting Akihiro Inokuchi

Rights and permissions

Reprints and permissions

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Koh, Y.S., Pears, R. (2008). Transaction Clustering Using a Seeds Based Approach. In: Washio, T., Suzuki, E., Ting, K.M., Inokuchi, A. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2008. Lecture Notes in Computer Science(), vol 5012. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68125-0_93

Download citation

  • DOI: https://doi.org/10.1007/978-3-540-68125-0_93

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-68124-3

  • Online ISBN: 978-3-540-68125-0

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics