A hidden semi-Markov model for chart pattern matching in financial time series

Wan, Yuqing; Si, Yain-Whar

doi:10.1007/s00500-017-2703-7

Methodologies and Application
Published: 10 July 2017

Volume 22, pages 6525–6544, (2018)
Soft Computing

Abstract

Many pattern matching approaches have been applied to financial time series to detect chart patterns and predict price trends. In this paper, we propose an extended hidden semi-Markov model for chart pattern matching (HSMM-CP). In our approach, a hidden semi-Markov model is trained and a Viterbi algorithm is used to detect chart patterns. The proposed approach not only simplifies the traditional way of training an HSMM, but also reduces potential biases in parameter initialisation. We compare the proposed model with current approaches on a set of templates selected from 53 chart patterns. Experiments on a synthetic dataset show that the proposed approach has the highest average accuracy and recall among the pattern matching approaches compared. Specifically, the HSMM-CP approach achieves the highest accuracy for the “Triangles, Ascending”, “Head-and-Shoulders Tops”, “Triple Tops” and “Cup with Handle” patterns. Moreover, the experimental results show that HSMM-CP performs significantly better than the other approaches in distinguishing patterns with similar shapes, such as “Head-and-Shoulders Tops” and “Triple Tops”. Experiments are also conducted on a real dataset comprising the historical prices of several stocks.

Acknowledgements

This research was funded by the Research Committee of the University of Macau, Grants MYRG2017-00029-FST and MYRG2016-00148-FST.

Author information

Authors and Affiliations

Department of Computer and Information Science, University of Macau, Macau, China
Yuqing Wan & Yain-Whar Si

Corresponding author

Correspondence to Yain-Whar Si.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Communicated by V. Loia.

Appendices

Appendix A: How to set thresholds for the TB, ED and DTW approaches

We use the H&S-T pattern as an example to illustrate how we set the thresholds in the experiment. We began by generating four datasets, one for each of the time series lengths 19, 43, 85 and 127. Each dataset contained one hundred time series: the top fifty were H&S-T positive time series and the bottom fifty were randomly generated negative time series. We then used the TB, ED and DTW approaches, respectively, to calculate the similarity for each time series in the four datasets. Because the top 50 were positive cases, their distances had to be smaller than those of the bottom 50, so the threshold was chosen to separate the two groups.

As shown in Fig. 14a–d, the threshold of the TB approach stayed fixed as the length of the time series increased. The threshold of the H&S-T pattern for TB was \(\theta =0.1\). As shown in Fig. 15a–d and Fig. 16a–d, the thresholds of the ED and DTW approaches both increased with the length of the time series. We therefore modelled the threshold of the ED and DTW approaches as a linear function of length.

In Fig. 15a–d, the thresholds for the ED approach are 20, 40, 85 and 143 for lengths of 19, 43, 85 and 127, respectively. Regressing the threshold on length gave a slope of \(\alpha =1.1417\) and an intercept of \(\beta =-6.2079\). For the DTW approach, in Fig. 16a–d, the thresholds 6, 10, 22 and 34 correspond to lengths of 19, 43, 85 and 127, respectively. Regressing the threshold on length gave a slope of \(\gamma =0.2649\) and an intercept of \(\varepsilon =-0.1457\).
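The regression described above can be checked with a few lines of Python. This is a minimal sketch that assumes an ordinary least-squares fit, which the appendix does not state explicitly; the function names `fit_line` and `threshold_for` are hypothetical and are not taken from the paper. Only the lengths and thresholds quoted in the text are used as input.

```python
def fit_line(lengths, thresholds):
    """Least-squares fit of threshold = slope * length + intercept.

    Assumption: ordinary least squares; the appendix only says the
    threshold was regressed as a linear function of length.
    """
    n = len(lengths)
    mean_x = sum(lengths) / n
    mean_y = sum(thresholds) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lengths, thresholds))
    sxx = sum((x - mean_x) ** 2 for x in lengths)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def threshold_for(length, slope, intercept):
    """Evaluate the fitted linear threshold model at a given series length."""
    return slope * length + intercept


# Lengths and thresholds as quoted in Appendix A.
lengths = [19, 43, 85, 127]
ed_thresholds = [20, 40, 85, 143]
dtw_thresholds = [6, 10, 22, 34]

alpha, beta = fit_line(lengths, ed_thresholds)
gamma, epsilon = fit_line(lengths, dtw_thresholds)

print(f"ED:  alpha = {alpha:.4f}, beta = {beta:.4f}")
print(f"DTW: gamma = {gamma:.4f}, epsilon = {epsilon:.4f}")
```

Under this assumption the fit gives the coefficients reported above to four decimal places, and `threshold_for` can then be used to obtain a threshold for a series length that was not among the four calibration lengths.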

Appendix B: Experimental settings for a synthetic dataset

The experimental settings are shown in Table 16.

Table 16 In the experiment conducted to distinguish H&S-T and Trip-T, the setting was (Distinguish, 100, 115, 7, 0.1, 1.1417, -6.2079, 0.2649, -0.1457). Distinguish was a dataset containing 50 H&S-T and 50 Trip-T time series. As we designated the H&S-T patterns as positive cases, the threshold settings for the TB, ED and DTW approaches matched those in the H&S-T pattern recognition experiment.

About this article

Cite this article

Wan, Y., Si, YW. A hidden semi-Markov model for chart pattern matching in financial time series. Soft Comput 22, 6525–6544 (2018). https://doi.org/10.1007/s00500-017-2703-7

Issue Date: October 2018