
Decision Tree Clustering for Time Series Data: An Approach for Enhanced Interpretability and Efficiency

  • Conference paper
  • First Online:
PRICAI 2023: Trends in Artificial Intelligence (PRICAI 2023)

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 14326))


Abstract

Clustering is an unsupervised learning method for grouping similar data samples. Although clustering is widely used, traditional clustering methods cannot provide clear interpretations of the resulting clusters. This has led to growing interest in interpretable clustering methods, most of which are based on decision trees. However, existing interpretable clustering methods are typically designed for tabular data and struggle with time series data because of its complex nature. In this paper, we propose a novel interpretable time-series clustering method based on decision trees. To address the interpretability challenges of time-series data, our method employs two separate feature sets: intuitive features for decision-tree branching and the original time-series observations for evaluating a given clustering metric. This dual use enables us to construct interpretable clustering trees for time series data. In addition, to handle datasets with a large number of samples, we propose a new metric for evaluating clustering quality, called the surrogate silhouette coefficient, and present a heuristic algorithm that constructs a decision tree based on this metric. We show that the computational complexity of evaluating the proposed metric is much lower than that of the silhouette coefficient, which is commonly used in decision tree-based clustering. Our numerical experiments demonstrate that our method constructs decision trees faster than existing methods based on the silhouette coefficient while maintaining clustering quality. We also applied our method to time-series data from an e-commerce platform and constructed an insightful decision tree.
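The dual-feature-set idea in the abstract can be sketched as follows. This is a minimal, hypothetical illustration, not the authors' implementation: branching candidates come from interpretable features (e.g., per-series means), while split quality is scored with the standard silhouette coefficient on the raw observations. The paper's surrogate silhouette coefficient is not reproduced here, since its definition is not given in this excerpt; names such as `best_split` are illustrative.

```python
import numpy as np

def silhouette(X, labels):
    """Mean silhouette coefficient (Rousseeuw, 1987) with Euclidean
    distances on the raw series. Pairwise distances cost O(n^2 T),
    which is the expense the surrogate metric is meant to reduce."""
    n = len(X)
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    score = 0.0
    for i in range(n):
        same = labels == labels[i]
        if same.sum() == 1:
            continue  # s(i) = 0 for singleton clusters, by convention
        a = D[i, same & (np.arange(n) != i)].mean()
        b = min(D[i, labels == c].mean()
                for c in set(labels.tolist()) if c != labels[i])
        score += (b - a) / max(a, b) if max(a, b) > 0 else 0.0
    return score / n

def best_split(X, features):
    """Greedy one-level step of a clustering tree: try every
    (feature, threshold) pair on the interpretable features, and keep
    the split whose two leaves score highest on the raw series."""
    best = (None, None, -1.0)
    for j in range(features.shape[1]):
        for t in np.unique(features[:, j])[:-1]:
            labels = (features[:, j] > t).astype(int)
            if labels.min() == labels.max():
                continue  # degenerate split, all samples on one side
            s = silhouette(X, labels)
            if s > best[2]:
                best = (j, t, s)
    return best
```

On two well-separated groups of flat series with `features = X.mean(axis=1, keepdims=True)`, `best_split` selects the threshold between the groups, and the resulting rule ("mean > t") is directly readable, which is the interpretability benefit the abstract describes.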



Notes

  1. All code and scripts for our method are available at https://github.com/tokyotech-nakatalab/interpretable_time-series_clustering.


Acknowledgments

This study was conducted as part of the Data Analysis Competition hosted by the Joint Association Study Group of Management Science. The authors would like to thank the organizers and Rakuten Group, Inc. for providing a real data set.

Author information


Corresponding author

Correspondence to Masaki Higashi.



Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Higashi, M. et al. (2024). Decision Tree Clustering for Time Series Data: An Approach for Enhanced Interpretability and Efficiency. In: Liu, F., Sadanandan, A.A., Pham, D.N., Mursanto, P., Lukose, D. (eds) PRICAI 2023: Trends in Artificial Intelligence. PRICAI 2023. Lecture Notes in Computer Science, vol 14326. Springer, Singapore. https://doi.org/10.1007/978-981-99-7022-3_42

Download citation

  • DOI: https://doi.org/10.1007/978-981-99-7022-3_42

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-7021-6

  • Online ISBN: 978-981-99-7022-3

  • eBook Packages: Computer Science (R0)
