Abstract
Classification of time series is an important task with many challenging applications such as brain wave (EEG) analysis, signature verification, and speech recognition. In this paper we show how characteristic local patterns (motifs) can improve classification accuracy. We introduce a new motif class, generalized semi-continuous motifs. To allow flexibility and noise robustness, these motifs may include gaps of various lengths as well as generic and more specific wildcards. We propose an efficient algorithm for mining generalized sequential motifs. In experiments on real medical data, we show how generalized semi-continuous motifs improve the accuracy of SVMs and Bayesian Networks for time series classification.
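To make the idea of motifs with gaps and wildcards concrete, the following is a minimal sketch and not the authors' algorithm. It assumes the time series has already been discretized into a sequence of symbols, and it uses a hypothetical motif encoding invented for this illustration: a literal symbol, `None` for a generic wildcard, a set of symbols for a more specific wildcard, and a `(lo, hi)` tuple for a gap of variable length. The function names `matches_at`, `contains`, and `motif_features` are likewise assumptions. The sketch covers only matching and the construction of binary motif features for a classifier; it does not cover motif mining.

```python
from typing import List, Sequence

# Hypothetical motif encoding (an assumption, not the paper's notation):
#   "a"          literal symbol
#   None         generic wildcard: matches any single symbol
#   {"a", "b"}   specific wildcard: matches any one symbol from the set
#   (lo, hi)     gap: skips between lo and hi arbitrary symbols


def matches_at(series: Sequence[str], motif: Sequence, start: int) -> bool:
    """Return True if `motif` matches `series` beginning exactly at `start`."""
    if not motif:
        return True  # every element has been matched
    head, rest = motif[0], motif[1:]
    if isinstance(head, tuple):
        # Gap: try every allowed length that stays inside the series.
        lo, hi = head
        return any(
            matches_at(series, rest, start + skip)
            for skip in range(lo, hi + 1)
            if start + skip <= len(series)
        )
    if start >= len(series):
        return False  # a symbol is required but the series has ended
    symbol = series[start]
    if head is None:
        ok = True
    elif isinstance(head, (set, frozenset)):
        ok = symbol in head
    else:
        ok = symbol == head
    return ok and matches_at(series, rest, start + 1)


def contains(series: Sequence[str], motif: Sequence) -> bool:
    """Return True if `motif` occurs anywhere in `series`."""
    return any(matches_at(series, motif, i) for i in range(len(series) + 1))


def motif_features(series: Sequence[str], motifs: Sequence[Sequence]) -> List[int]:
    """Binary feature vector: one 0/1 entry per motif."""
    return [int(contains(series, m)) for m in motifs]


if __name__ == "__main__":
    series = list("abcabd")
    motifs = [
        ["a", (0, 2), "d"],   # 'a', then up to two skipped symbols, then 'd'
        ["a", None, "c"],     # 'a', any symbol, 'c'
        [{"c", "d"}, "a"],    # 'c' or 'd', followed by 'a'
        ["d", "a"],           # plain continuous motif
    ]
    print(motif_features(series, motifs))
```

Feature vectors of this kind, one per time series, could then be passed to a standard classifier such as an SVM or a Bayesian network.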
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Buza, K., Schmidt-Thieme, L. (2009). Motif-Based Classification of Time Series with Bayesian Networks and SVMs. In: Fink, A., Lausen, B., Seidel, W., Ultsch, A. (eds) Advances in Data Analysis, Data Handling and Business Intelligence. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01044-6_9
DOI: https://doi.org/10.1007/978-3-642-01044-6_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01043-9
Online ISBN: 978-3-642-01044-6
eBook Packages: Mathematics and Statistics (R0)