Abstract
Many pattern matching approaches have been applied to financial time series to detect chart patterns and predict price trends. In this paper, we propose an extended hidden semi-Markov model for chart pattern matching (HSMM-CP). In our approach, a hidden semi-Markov model is trained and the Viterbi algorithm is used to detect chart patterns. The proposed approach not only simplifies the traditional way of training an HSMM, but also reduces potential biases in parameter initialisation. We compare the proposed model with current approaches on a set of templates selected from 53 chart patterns. Experiments on a synthetic dataset show that the proposed approach achieves the highest average accuracy and recall of all the pattern matching approaches compared. Specifically, the HSMM-CP approach achieves the highest accuracy for the “Triangles, Ascending”, “Head-and-Shoulders Tops”, “Triple Tops” and “Cup with Handle” patterns. Moreover, the experimental results show that HSMM-CP performs significantly better than the other approaches in distinguishing patterns with similar shapes, such as “Head-and-Shoulders Tops” and “Triple Tops”. Experiments are also conducted on a real dataset comprising the historical prices of several stocks.
Acknowledgements
This research was funded by the Research Committee of the University of Macau, Grants MYRG2017-00029-FST and MYRG2016-00148-FST.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Communicated by V. Loia.
Appendices
Appendix A: How to set thresholds for the TB, ED and DTW approaches
We use the H&S-T pattern as an example to illustrate how we set the thresholds in the experiment. We began by generating four datasets, each containing one hundred time series (the first fifty were H&S-T positive time series and the last fifty were randomly generated negative time series), with lengths of 19, 43, 85 and 127, respectively. We then used the TB, ED and DTW approaches in turn to calculate the similarity of each time series in the four datasets to the H&S-T pattern. Because the first 50 series in each dataset were positive cases, their distances had to be smaller than those of the last 50, and each threshold was chosen accordingly. As shown in Fig. 14a–d, the threshold of the TB approach stayed fixed as the length of the time series increased; for the H&S-T pattern, the TB threshold was \(\theta =0.1\). As shown in Fig. 15a–d and Fig. 16a–d, the thresholds of the ED and DTW approaches increased with the length of the time series, so we modelled each of them as a linear function of length. In Fig. 15a–d, the thresholds for the ED approach are 20, 40, 85 and 143 for lengths of 19, 43, 85 and 127, respectively. Regressing the threshold on the length gives a slope of \(\alpha =1.1417\) and an intercept of \(\beta =-6.2079\). For the DTW approach, in Fig. 16a–d, the thresholds are 6, 10, 22 and 34 for lengths of 19, 43, 85 and 127, respectively. The corresponding linear regression gives a slope of \(\gamma =0.2649\) and an intercept of \(\varepsilon =-0.1457\).
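The reported regression coefficients can be checked with a short script. The following is a minimal sketch, assuming the fit is an ordinary least-squares line through the four (length, threshold) pairs quoted above; the function and variable names (`fit_line`, `threshold_for_length`, `ed_thresholds`, `dtw_thresholds`) are illustrative and do not come from the original experiments.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and cross-deviations around the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def threshold_for_length(length, slope, intercept):
    """Evaluate the fitted line to get a threshold for a given series length."""
    return slope * length + intercept


# Series lengths and the thresholds quoted in Appendix A for the H&S-T pattern.
lengths = [19, 43, 85, 127]
ed_thresholds = [20, 40, 85, 143]   # ED approach (Fig. 15a-d)
dtw_thresholds = [6, 10, 22, 34]    # DTW approach (Fig. 16a-d)

alpha, beta = fit_line(lengths, ed_thresholds)
gamma, epsilon = fit_line(lengths, dtw_thresholds)

print(f"ED:  alpha = {alpha:.4f}, beta = {beta:.4f}")
print(f"DTW: gamma = {gamma:.4f}, epsilon = {epsilon:.4f}")
```

Under this assumption, the fitted values agree with the coefficients stated in the text to four decimal places. A threshold for an intermediate length can then be read off the line, for example `threshold_for_length(60, alpha, beta)` for the ED approach.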
Appendix B: Experimental settings for a synthetic dataset
The experimental settings are shown in Table 16.
Cite this article
Wan, Y., Si, YW. A hidden semi-Markov model for chart pattern matching in financial time series. Soft Comput 22, 6525–6544 (2018). https://doi.org/10.1007/s00500-017-2703-7
DOI: https://doi.org/10.1007/s00500-017-2703-7