Knowledge discovery and variable scale evaluation for long series data

Zhai, Yanwei; Lv, Zheng; Zhao, Jun; Wang, Wei

doi:10.1007/s10462-022-10250-0

Published: 24 August 2022

Artificial Intelligence Review, Volume 56, pages 3157–3180 (2023)

Yanwei Zhai^1,2, Zheng Lv^1,2 (ORCID: orcid.org/0000-0001-8393-7074), Jun Zhao^1,2 & Wei Wang^1,2

Abstract

Knowledge discovery and evaluation is a challenging but rewarding process of automatically obtaining usable information from a database. Because the collected data are heterogeneous, the knowledge they contain is uncertain, occurs at random, and varies in scale. This paper therefore presents an unsupervised knowledge discovery and variable scale evaluation model based on a new multi-feature fusion method. First, to capture multiple information features, an amplitude-frequency-shape based state description form is proposed, which analyzes the time series from the aspects of energy, phase, and knowledge similarity. Because the number and scale of knowledge fragments vary, a piecewise linear segmentation criterion is put forward based on the complexity and accuracy of information representation. A model-free knowledge discovery framework that requires no sample labels is then constructed to discover knowledge quickly and effectively. To address the variable knowledge scale, a variable scale evaluation method is proposed for the first time to distinguish multi-scale decision-making knowledge based on indicators of system stability and security; it can optimize the knowledge base and guide the decision-making process. Experimental results on heterogeneous activity datasets indicate that the proposed method can analyze the time series state in general and discover knowledge efficiently from massive data. In addition, knowledge discovery and evaluation on a continuous decision system show that the proposed framework can meet the needs of knowledge discovery in complex environments and effectively distinguish knowledge, providing strong support for establishing a credible decision-making system.
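
The abstract describes a piecewise linear segmentation criterion that trades off the complexity and the accuracy of the representation, but the criterion itself is not given in this preview. The following is only a generic sketch of that kind of trade-off, not the authors' method: it assumes accuracy is measured by the squared residuals of a least-squares line per segment and complexity by a fixed cost per segment. The function name `segment_piecewise_linear` and the parameters `penalty` and `min_len` are hypothetical.

```python
from itertools import accumulate


def segment_piecewise_linear(y, penalty, min_len=2):
    """Split y into contiguous segments, each fitted by a least-squares line.

    Assumed objective (not taken from the article):
        sum of squared residuals + penalty * number of segments.
    Returns a list of (start, end) index pairs, end exclusive.
    """
    n = len(y)
    if n < min_len:
        raise ValueError("series is shorter than min_len")

    xs = range(n)
    # Prefix sums give the fit error of any segment in constant time.
    px = list(accumulate(xs, initial=0))
    py = list(accumulate(y, initial=0.0))
    pxx = list(accumulate((x * x for x in xs), initial=0))
    pxy = list(accumulate((x * v for x, v in zip(xs, y)), initial=0.0))
    pyy = list(accumulate((v * v for v in y), initial=0.0))

    def sse(a, b):
        """Squared error of the best-fit line over indices a..b-1."""
        m = b - a
        sx, sy = px[b] - px[a], py[b] - py[a]
        var_x = (pxx[b] - pxx[a]) - sx * sx / m
        cov_xy = (pxy[b] - pxy[a]) - sx * sy / m
        var_y = (pyy[b] - pyy[a]) - sy * sy / m
        if var_x <= 0:  # single-point segment: no slope to fit
            return max(var_y, 0.0)
        # Clamp to zero to absorb floating-point cancellation.
        return max(var_y - cov_xy * cov_xy / var_x, 0.0)

    # best[b] is the lowest cost of segmenting the first b points.
    best = [float("inf")] * (n + 1)
    prev = [-1] * (n + 1)
    best[0] = 0.0
    for b in range(min_len, n + 1):
        for a in range(0, b - min_len + 1):
            cost = best[a] + sse(a, b) + penalty
            if cost < best[b]:
                best[b], prev[b] = cost, a

    # Walk the back-pointers to recover the segment boundaries.
    bounds = []
    b = n
    while b > 0:
        bounds.append((prev[b], b))
        b = prev[b]
    bounds.reverse()
    return bounds
```

In this sketch a small `penalty` favors accuracy (more, shorter segments) and a large one favors simplicity (fewer, longer segments). For the tent-shaped series `[0, 1, 2, 3, 4, 5, 5, 4, 3, 2, 1, 0]`, `penalty=0.5` yields the two segments `(0, 6)` and `(6, 12)`, while `penalty=100.0` collapses everything into `(0, 12)`.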

Data availability

The UCI datasets analyzed during the current study are available from the UCI Machine Learning Repository (http://archive.ics.uci.edu/ml). The blast furnace gas scheduling activities dataset that supports the findings of this study is not openly available because of the sharing agreement with the enterprise.

Acknowledgements

The authors wish to thank the Associate Editor and the anonymous reviewers for their valuable comments and constructive suggestions, which helped improve the presentation of the paper. This work was supported by the National Key R&D Program of China under Grant 2017YFA0700300, the National Natural Science Foundation of China under Grants 61833003, 61873048, and U1908218, and the Fundamental Research Funds for the Central Universities under Grant DUT22JC16.

Author information

Authors and Affiliations

Key Laboratory of Intelligent Control and Optimization for Industrial Equipment, Ministry of Education, Dalian University of Technology, Dalian, China
Yanwei Zhai, Zheng Lv, Jun Zhao & Wei Wang
School of Control Science and Engineering, Dalian University of Technology, Dalian, China
Yanwei Zhai, Zheng Lv, Jun Zhao & Wei Wang

Corresponding author

Correspondence to Zheng Lv.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhai, Y., Lv, Z., Zhao, J. et al. Knowledge discovery and variable scale evaluation for long series data. Artif Intell Rev 56, 3157–3180 (2023). https://doi.org/10.1007/s10462-022-10250-0

Accepted: 02 August 2022
Published: 24 August 2022
Issue Date: April 2023
DOI: https://doi.org/10.1007/s10462-022-10250-0
