
Symbolic representation based on trend features for knowledge discovery in long time series

Frontiers of Information Technology & Electronic Engineering

Abstract

The symbolic representation of time series has attracted much research interest recently. The high dimensionality typical of such data is challenging, especially as time series grow longer, and the wide deployment of sensors collecting ever more data exacerbates the problem. Representing a time series effectively is essential for decision-making activities such as classification, prediction, and knowledge discovery. In this paper, we propose a new symbolic representation method for long time series based on trend features, called trend feature symbolic approximation (TFSA). The method uses a two-step mechanism to segment long time series rapidly. Unlike some previous symbolic methods, it focuses on retaining most of the trend features and patterns of the original series. A time series is represented by trend symbols, which are also suitable for knowledge discovery tasks such as association rule mining. TFSA provides a lower-bounding guarantee. Experimental results show that, compared with some previous methods, TFSA not only achieves better segmentation efficiency and classification accuracy but is also applicable to knowledge discovery from time series.
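To make the idea of a trend-symbol representation concrete, the following is a minimal Python sketch. It is not the TFSA algorithm, whose two-step segmentation and symbol definitions are not given in the abstract. It only illustrates the general technique of turning a numeric series into a short sequence of trend symbols. The function name `trend_symbols`, the three-letter alphabet (`U`, `D`, `F`), and the `flat_tol` threshold are all assumptions introduced for illustration.

```python
from typing import List, Tuple


def trend_symbols(series: List[float], flat_tol: float = 0.1) -> List[Tuple[str, int]]:
    """Compress a series into (symbol, run_length) pairs.

    Hypothetical alphabet, not taken from the paper:
      "U" = step up, "D" = step down, "F" = flat (|step| <= flat_tol).
    Consecutive steps with the same symbol are merged into one segment.
    """
    def direction(delta: float) -> str:
        if delta > flat_tol:
            return "U"
        if delta < -flat_tol:
            return "D"
        return "F"

    symbols: List[Tuple[str, int]] = []
    # Walk over successive pairs of points and classify each step.
    for prev, curr in zip(series, series[1:]):
        sym = direction(curr - prev)
        if symbols and symbols[-1][0] == sym:
            # Same trend as the previous step: extend the current segment.
            symbols[-1] = (sym, symbols[-1][1] + 1)
        else:
            # Trend changed: start a new segment.
            symbols.append((sym, 1))
    return symbols


example = [1.0, 2.0, 3.5, 3.5, 3.45, 2.0, 1.0, 1.5]
print(trend_symbols(example))
```

A symbol sequence of this kind is much shorter than the raw series, and because each symbol has a readable meaning, sequences like this can be fed to pattern-mining procedures such as association rule mining. A real method would also need a distance measure on the symbols to support the lower-bounding property mentioned above, which this sketch does not attempt.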



Author information

Corresponding author

Correspondence to Hong Yin.

Additional information

Project supported by the National High-Tech R&D Program (863) of China (Nos. 2012AA012600, 2011AA010702, 2012AA01A401, and 2012AA01A402), the National Natural Science Foundation of China (No. 60933005), and the National Science and Technology of China (No. 2012BAH38B04).

ORCID: Hong YIN, http://orcid.org/0000-0002-0682-6781


About this article


Cite this article

Yin, H., Yang, Sq., Zhu, Xq. et al. Symbolic representation based on trend features for knowledge discovery in long time series. Frontiers Inf Technol Electronic Eng 16, 744–758 (2015). https://doi.org/10.1631/FITEE.1400376


  • DOI: https://doi.org/10.1631/FITEE.1400376
