Stock Time Series Categorization and Clustering Via SB-Tree Optimization

Fu, Tak-chung; Law, Chi-wai; Chan, Kin-kee; Chung, Fu-lai; Ng, Chak-man

doi:10.1007/11881599_141

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4223)

Included in the following conference series:

International Conference on Fuzzy Systems and Knowledge Discovery

Abstract

SB-Tree is a data structure that represents a time series according to the importance of its data points. Compared with traditional time series representations, it offers three advantages: it represents the series directly in the time domain (preserving its shape), it retrieves data points in order of importance, and it facilitates multi-resolution retrieval. These benefits make the representation particularly attractive for financial time series and for the associated data mining tasks of categorization and clustering. This paper reports an investigation of the size of the SB-Tree and proposes two optimization approaches that reduce that size while preserving the overall shape of the time series. As demonstrated by various experiments, the proposed approaches are suitable for different categorization and clustering applications.
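
To make the idea of importance-ordered, multi-resolution retrieval concrete, here is a minimal Python sketch. It is not the SB-Tree from the paper: the abstract does not say how importance is measured or how the tree is laid out. The sketch assumes that a point's importance is its vertical distance from the line joining its already-selected neighbours, and it stores the result as a flat ordered list rather than a tree. The function names `vertical_distance`, `importance_order`, and `coarse_view`, and the sample data, are hypothetical.

```python
from typing import List, Sequence, Tuple


def vertical_distance(series: Sequence[float], left: int, right: int, i: int) -> float:
    """Vertical gap between point i and the straight line joining left and right."""
    slope = (series[right] - series[left]) / (right - left)
    return abs(series[left] + slope * (i - left) - series[i])


def importance_order(series: Sequence[float]) -> List[int]:
    """Return every index of `series`, most important first.

    Assumption (not taken from the abstract): the two endpoints come first,
    and each later pick is the point farthest, vertically, from the line
    joining its nearest already-picked neighbours.
    """
    n = len(series)
    if n <= 2:
        return list(range(n))
    picked = [0, n - 1]  # indices chosen so far, kept sorted by time
    order = [0, n - 1]   # the same indices, in order of selection
    while len(order) < n:
        best_i, best_d = -1, -1.0
        # Scan every gap between adjacent picked points for the farthest point.
        for left, right in zip(picked, picked[1:]):
            for i in range(left + 1, right):
                d = vertical_distance(series, left, right, i)
                if d > best_d:
                    best_i, best_d = i, d
        order.append(best_i)
        picked.append(best_i)
        picked.sort()
    return order


def coarse_view(series: Sequence[float], order: Sequence[int], k: int) -> List[Tuple[int, float]]:
    """Keep only the k most important points, returned in time order."""
    return [(i, series[i]) for i in sorted(order[:k])]


prices = [1.0, 2.0, 5.0, 2.0, 1.0, 0.0, 3.0]  # made-up data for illustration
order = importance_order(prices)
print(coarse_view(prices, order, 4))
```

Raising `k` adds finer detail to the same view, which is one simple way to read "multi-resolution": the coarse shape comes from the first few points and later points only refine it.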

Author information

Authors and Affiliations

Department of Computing, The Hong Kong Polytechnic University, Hong Kong
Tak-chung Fu, Chi-wai Law, Kin-kee Chan & Fu-lai Chung
Department of Computing and Information Management, Hong Kong Institute of Vocational Education (Chai Wan), Hong Kong
Tak-chung Fu & Chak-man Ng

Editor information

Editors and Affiliations

School of Electrical and Electronic Engineering, Nanyang Technological University, Block S1, Nanyang Avenue, 639798, Singapore
Lipo Wang
Life Science Research Center, School of Electronic Engineering, Xidian University, 710071, Xi’an, Shaanxi, China
Licheng Jiao
School of Electrical and Electronic Engineering, Xidian University, 710071, Xi’an, China
Guanming Shi
School of Information Technology and Electrical Engineering, The University of Queensland, 4072, Brisbane, Queensland, Australia
Xue Li
College of Mathematics and Information Science, Hebei Normal University, 050016, Shijiazhuang, Hebei, P.R. China
Jing Liu

About this paper

Cite this paper

Fu, Tc., Law, Cw., Chan, Kk., Chung, Fl., Ng, Cm. (2006). Stock Time Series Categorization and Clustering Via SB-Tree Optimization. In: Wang, L., Jiao, L., Shi, G., Li, X., Liu, J. (eds) Fuzzy Systems and Knowledge Discovery. FSKD 2006. Lecture Notes in Computer Science, vol 4223. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11881599_141

DOI: https://doi.org/10.1007/11881599_141
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-45916-3
Online ISBN: 978-3-540-45917-0
eBook Packages: Computer Science (R0)