Abstract
Time series motifs are an integral part of diverse data mining applications, including classification, summarization, and near-duplicate detection. They are used across a wide variety of domains, such as image processing, bioinformatics, medicine, extreme weather prediction, the analysis of web log and customer shopping sequences, the study of XML query access patterns, electroencephalograph interpretation, and entomological telemetry data mining. Exact motif discovery in soft real-time over 100K time series is a challenging problem. We present novel parallel algorithms for soft real-time exact motif discovery on multi-core architectures. Experimental results on a large-scale P6 SMP system, using real-life and synthetic time series data, demonstrate the scalability of our algorithms and their ability to discover motifs in soft real-time. To the best of our knowledge, this is the first such work on parallel, scalable, soft real-time exact motif discovery.
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Narang, A., Bhattacherjee, S. (2010). Parallel Exact Time Series Motif Discovery. In: D’Ambra, P., Guarracino, M., Talia, D. (eds) Euro-Par 2010 - Parallel Processing. Euro-Par 2010. Lecture Notes in Computer Science, vol 6272. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15291-7_28
DOI: https://doi.org/10.1007/978-3-642-15291-7_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15290-0
Online ISBN: 978-3-642-15291-7
eBook Packages: Computer Science, Computer Science (R0)