Structured Spectral Clustering of PurTree Data

  • Conference paper
Database Systems for Advanced Applications (DASFAA 2019)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11447)

Abstract

A “Purchase Tree” data structure was recently proposed to compress customer transaction data, and a local PurTree spectral clustering method was proposed to recover the cluster structure from purchase trees. However, the PurTree distance assigns equal weights to all child nodes of a parent node, so it does not distinguish the importance of different nodes. In this paper, we propose a Structured PurTree Subspace Spectral (SPSS) clustering algorithm for PurTree data. The new method introduces a PurTree subspace similarity for computing the similarity between two trees, in which a set of sparse and structured node weights distinguishes the importance of different nodes in a purchase tree. A new clustering model learns a structured graph with explicit cluster structure, and an iterative optimization algorithm simultaneously learns the structured graph and the node weights. We also propose a balanced cover tree for fast k-NN search when building affinity matrices. SPSS was compared with six clustering algorithms on 10 benchmark data sets, and the experimental results show the superiority of the new method.
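
The idea of a graph with "explicit cluster structure" can be illustrated with a small sketch. It assumes, as is common in structured graph learning but not stated in the abstract itself, that the learned affinity graph has exactly as many connected components as clusters, so cluster labels can be read directly off the graph. The function name `components_as_clusters`, the threshold `tol`, and the toy affinity matrix are hypothetical and are not taken from the paper.

```python
from typing import Dict, List


def components_as_clusters(affinity: List[List[float]], tol: float = 1e-12) -> List[int]:
    """Label each node by the connected component it belongs to.

    Assumption (not from the paper): an edge exists wherever the
    affinity between two nodes exceeds `tol` in either direction.
    """
    n = len(affinity)
    parent = list(range(n))

    def find(i: int) -> int:
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge the components of every pair of linked nodes.
    for i in range(n):
        for j in range(i + 1, n):
            if affinity[i][j] > tol or affinity[j][i] > tol:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[rj] = ri

    # Relabel roots as consecutive cluster ids in order of first appearance.
    labels: List[int] = []
    ids: Dict[int, int] = {}
    for i in range(n):
        root = find(i)
        labels.append(ids.setdefault(root, len(ids)))
    return labels


# Toy block-diagonal affinity: nodes {0, 1, 2} and {3, 4} are not linked.
toy = [
    [0.0, 0.6, 0.4, 0.0, 0.0],
    [0.6, 0.0, 0.4, 0.0, 0.0],
    [0.4, 0.4, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0],
]
labels = components_as_clusters(toy)
print(labels)
```

On this toy matrix the two disconnected blocks come out as two clusters. In a real pipeline the affinity matrix would be learned from the data; here it is fixed by hand only to show how a block-structured graph yields cluster assignments without a separate discretization step.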

Notes

  1. https://community.tableau.com/docs/DOC-1236

  2. http://www.kaggle.com/c/acquire-valued-shoppers-challenge/data

Acknowledgment

This research was supported by the National Key R&D Program of China (2018YFB1003201), NSFC under Grant nos. 61773268, 61502177, and U1636202, and Guangdong Key Laboratory project 2017B030314073.

Author information

Corresponding author

Correspondence to Xiaojun Chen.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Chen, X., Guo, C., Fang, Y., Mao, R. (2019). Structured Spectral Clustering of PurTree Data. In: Li, G., Yang, J., Gama, J., Natwichai, J., Tong, Y. (eds) Database Systems for Advanced Applications. DASFAA 2019. Lecture Notes in Computer Science, vol 11447. Springer, Cham. https://doi.org/10.1007/978-3-030-18579-4_29

  • DOI: https://doi.org/10.1007/978-3-030-18579-4_29

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-18578-7

  • Online ISBN: 978-3-030-18579-4

  • eBook Packages: Computer Science, Computer Science (R0)
