Abstract
Transforming multivariate time series into feature spaces is common for data mining tasks such as classification. Ordinality is one important property of time series that provides a qualitative representation of the underlying dynamic regime. In a multivariate time series, ordinalities from multiple dimensions combine to be discriminative for the classification problem. However, existing works on ordinality do not address the multivariate nature of the time series. Multivariate ordinal patterns pose a computational challenge: the number of pattern combinations explodes, while not all patterns are relevant or provide novel information for the classification. In this work, we propose a technique for the extraction and selection of relevant and non-redundant multivariate ordinal patterns from the high-dimensional combinatorial search space. Our proposed approach, ordinal feature extraction (ordex), simultaneously extracts ordinal patterns and scores their relevance and redundancy without training a classifier. As a filter-based approach, ordex aims to select a set of relevant patterns with complementary information. Hence, using our scoring function based on the principles of Chebyshev’s inequality, we maximize the relevance of the patterns and minimize the correlation between them. Our experiments on real-world datasets show that ordinality in time series contains valuable information for classification in several applications.
Notes
- 1. As \(t=1\) and \(t=2\) have fewer than \(d-1\) preceding values.
- 2. In Fig. 1, \(\mathbb{O}_{d=3}(X,t=4)=X(t)>X(t-1)>X(t-(3-1))=012\).
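The encoding in Note 2 can be illustrated with a minimal sketch. It assumes one reading of that note: a pattern of order \(d\) at time \(t\) lists the lags \(0,\dots,d-1\) in the order that sorts \(X(t-\text{lag})\) from largest to smallest, so that \(X(t)>X(t-1)>X(t-2)\) yields `012`. The function names, the 0-based time index, the tie-breaking rule, and the way per-dimension patterns are joined into one multivariate pattern are illustrative assumptions, not the ordex implementation.

```python
from collections import Counter
from typing import Dict, Sequence, Tuple

Pattern = Tuple[int, ...]


def ordinal_pattern(x: Sequence[float], t: int, d: int = 3) -> Pattern:
    """Lags 0..d-1 ordered so that x[t - lag] is descending.

    t is a 0-based index (the paper counts from 1), so t must be >= d - 1.
    """
    if t < d - 1 or t >= len(x):
        raise ValueError("t must satisfy d - 1 <= t < len(x)")
    # Assumption: ties are broken in favour of the smaller lag.
    return tuple(sorted(range(d), key=lambda lag: (-x[t - lag], lag)))


def pattern_histogram(x: Sequence[float], d: int = 3) -> Dict[Pattern, float]:
    """Relative frequency of each ordinal pattern over all valid time points."""
    counts = Counter(ordinal_pattern(x, t, d) for t in range(d - 1, len(x)))
    total = sum(counts.values())
    if total == 0:  # series shorter than d: no pattern is defined
        return {}
    return {pattern: count / total for pattern, count in counts.items()}


def multivariate_pattern(
    series: Sequence[Sequence[float]], t: int, d: int = 3
) -> Tuple[Pattern, ...]:
    """Hypothetical combination: one univariate pattern per dimension at time t."""
    return tuple(ordinal_pattern(x, t, d) for x in series)


if __name__ == "__main__":
    x = [4, 7, 9, 10, 6]
    print(ordinal_pattern(x, t=3))          # strictly rising window
    print(pattern_histogram(x))
    print(multivariate_pattern([x, x[::-1]], t=2))
```

Under this combination rule, each of the \(m\) dimensions can take up to \(d!\) patterns, so up to \((d!)^m\) joint patterns exist at a single time point, which is one way to see the combinatorial growth the abstract refers to.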
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Shekar, A.K., Pappik, M., Iglesias Sánchez, P., Müller, E. (2018). Selection of Relevant and Non-Redundant Multivariate Ordinal Patterns for Time Series Classification. In: Soldatova, L., Vanschoren, J., Papadopoulos, G., Ceci, M. (eds) Discovery Science. DS 2018. Lecture Notes in Computer Science, vol 11198. Springer, Cham. https://doi.org/10.1007/978-3-030-01771-2_15
DOI: https://doi.org/10.1007/978-3-030-01771-2_15
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-01770-5
Online ISBN: 978-3-030-01771-2
eBook Packages: Computer Science, Computer Science (R0)