Selection of Relevant and Non-Redundant Multivariate Ordinal Patterns for Time Series Classification

Shekar, Arvind Kumar; Pappik, Marcus; Iglesias Sánchez, Patricia; Müller, Emmanuel

doi:10.1007/978-3-030-01771-2_15

Arvind Kumar Shekar^17,18,
Marcus Pappik¹⁸,
Patricia Iglesias Sánchez¹⁷ &
…
Emmanuel Müller¹⁸

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 11198))

Included in the following conference series:

International Conference on Discovery Science

975 Accesses
3 Citations
2 Altmetric

Abstract

Transformation of multivariate time series into feature spaces are common for data mining tasks like classification. Ordinality is one important property in time series that provides a qualitative representation of the underlying dynamic regime. In a multivariate time series, ordinalities from multiple dimensions combine together to be discriminative for the classification problem. However, existing works on ordinality do not address the multivariate nature of the time series. For multivariate ordinal patterns, there is a computational challenge with an explosion of pattern combinations, while not all patterns are relevant and provide novel information for the classification. In this work, we propose a technique for the extraction and selection of relevant and non-redundant multivariate ordinal patterns from the high-dimensional combinatorial search space. Our proposed approach Ordinal feature extraction (ordex), simultaneously extracts and scores the relevance and redundancy of ordinal patterns without training a classifier. As a filter-based approach, ordex aims to select a set of relevant patterns with complementary information. Hence, using our scoring function based on the principles of Chebyshev’s inequality, we maximize the relevance of the patterns and minimize the correlation between them. Our experiments on real world datasets show that ordinality in time series contains valuable information for classification in several applications.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

1.
As \(t=1\) and 2 have less than \(d-1\) preceding values.
2.
In Fig. 1, \(\mathbb {O}_{d=3}(X,t=4)=X(t)>X(t-1)>X(t-(3-1))=012\).
3.
https://hpi.de//mueller/ordex.html.
4.
https://figshare.com/s/4023c66f7a87b59628b4.
5.
https://github.com/KDD-OpenSource/data-generation.

References

Bandt, C., Pompe, B.: Permutation entropy: a natural complexity measure for time series. Phys. Rev. Lett. 88(17), 174102 (2002)
Article Google Scholar
Ding, C., Peng, H.: Minimum redundancy feature selection from microarray gene expression data. J. Bioinf. Comput. Biol. 3(02), 185–205 (2005)
Article Google Scholar
Fulcher, B.D., Jones, N.S.: Highly comparative feature-based time-series classification. IEEE Trans. Knowl. Data Eng. 26(12), 3026–3037 (2014)
Article Google Scholar
Graff, G., et al.: Ordinal pattern statistics for the assessment of heart rate variability. Eur. Phys. J. Spec. Top. 222(2), 525–534 (2013)
Article Google Scholar
Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
Article Google Scholar
Hollander, M., Wolfe, D.A., Chicken, E.: Nonparametric Statistical Methods. Wiley, New York (2013)
MATH Google Scholar
Karlin, S., Studden, W.J.: Tchebycheff Systems: With Applications in Analysis and Statistics. Interscience, New York (1966)
MATH Google Scholar
Kate, R.J.: Using dynamic time warping distances as features for improved time series classification. Data Min. Knowl. Discov. 30(2), 283–312 (2016)
Article MathSciNet Google Scholar
Keller, F., Müller, E., Bohm, K.: Hics: high contrast subspaces for density-based outlier ranking. In: 2012 IEEE 28th International Conference on Data Engineering, pp. 1037–1048. IEEE (2012)
Google Scholar
Lichman, M.: UCI Machine Learning Repository (2013). http://archive.ics.uci.edu/ml
Lin, J., Khade, R., Li, Y.: Rotation-invariant similarity in time series using bag-of-patterns representation. J. Intell. Inf. Syst. 39(2), 287–315 (2012)
Article Google Scholar
Mörchen, F.: Time series feature extraction for data mining using DWT and DFT (2003)
Google Scholar
Nanopoulos, A., Alcock, R., Manolopoulos, Y.: Feature-based classification of time-series data. Int. J. Comput. Res. 10(3), 49–61 (2001)
Google Scholar
Saito, N.: Local feature extraction and its applications using a library of bases. Topics in Analysis and Its Applications: Selected Theses, pp. 269–451 (2000)
Chapter Google Scholar
Shekar, A.K., Bocklisch, T., Sánchez, P.I., Straehle, C.N., Müller, E.: Including multi-feature interactions and redundancy for feature ranking in mixed datasets. In: Ceci, M., Hollmén, J., Todorovski, L., Vens, C., Džeroski, S. (eds.) ECML PKDD 2017. LNCS (LNAI), vol. 10534, pp. 239–255. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-71249-9_15
Chapter Google Scholar
Sinn, M., Ghodsi, A., Keller, K.: Detecting change-points in time series by maximum mean discrepancy of ordinal pattern distributions. In: UAI 2012 Proceedings of the Twenty-Eighth Conference on Uncertainty in Artificial Intelligence (2012)
Google Scholar
Wang, X., Smith, K., Hyndman, R.: Characteristic-based clustering for time series data. Data Min. Knowl. Discov. 13(3), 335–364 (2006)
Article MathSciNet Google Scholar
Wang, X., Wirth, A., Wang, L.: Structure-based statistical features and multivariate time series clustering. In: Seventh IEEE International Conference on Data Mining, 2007, ICDM 2007, pp. 351–360. IEEE (2007)
Google Scholar
Wei, Y., Jiao, L., Wang, S., Chen, Y., Liu, D.: Time series classification with max-correlation and min-redundancy shapelets transformation. In: 2015 International Conference on Identification, Information, and Knowledge in the Internet of Things (IIKI), pp. 7–12. IEEE (2015)
Google Scholar
Xi, X., Keogh, E., Wei, L., Mafra-Neto, A.: Finding motifs in a database of shapes. In: Proceedings of the 2007 SIAM International Conference on Data Mining, pp. 249–260. SIAM (2007)
Google Scholar
Ye, L., Keogh, E.: Time series shapelets: a new primitive for data mining. In: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 947–956. ACM (2009)
Google Scholar

Download references

Author information

Authors and Affiliations

Robert Bosch GmbH, Stuttgart, Germany
Arvind Kumar Shekar & Patricia Iglesias Sánchez
Hasso Plattner Institute, Potsdam, Germany
Arvind Kumar Shekar, Marcus Pappik & Emmanuel Müller

Authors

Arvind Kumar Shekar
View author publications
You can also search for this author in PubMed Google Scholar
Marcus Pappik
View author publications
You can also search for this author in PubMed Google Scholar
Patricia Iglesias Sánchez
View author publications
You can also search for this author in PubMed Google Scholar
Emmanuel Müller
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Arvind Kumar Shekar .

Editor information

Editors and Affiliations

Goldsmiths University of London, London, UK
Larisa Soldatova
Eindhoven University of Technology, Eindhoven, The Netherlands
Joaquin Vanschoren
University of Cyprus, Nicosia, Cyprus
George Papadopoulos
Università degli Studi di Bari Aldo Moro, Bari, Italy
Michelangelo Ceci

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Shekar, A.K., Pappik, M., Iglesias Sánchez, P., Müller, E. (2018). Selection of Relevant and Non-Redundant Multivariate Ordinal Patterns for Time Series Classification. In: Soldatova, L., Vanschoren, J., Papadopoulos, G., Ceci, M. (eds) Discovery Science. DS 2018. Lecture Notes in Computer Science(), vol 11198. Springer, Cham. https://doi.org/10.1007/978-3-030-01771-2_15

Download citation

DOI: https://doi.org/10.1007/978-3-030-01771-2_15
Published: 07 October 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-01770-5
Online ISBN: 978-3-030-01771-2
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics