Mining Sequential Patterns across Time Sequences

Chen, Gong; Wu, Xindong; Zhu, Xingquan

doi:10.1007/s00354-007-0036-2


Published: 14 March 2008

New Generation Computing, Volume 26, pages 75–96 (2007)

Gong Chen¹,
Xindong Wu² &
Xingquan Zhu³


Abstract

In this paper, we deal with mining sequential patterns in multiple time sequences. Building on PrefixSpan, a state-of-the-art sequential pattern mining algorithm for transaction databases, we propose MILE (MIning in muLtiple sEquences), an efficient algorithm to facilitate the mining process. MILE recursively utilizes the knowledge of existing patterns to avoid redundant data scanning, and can therefore effectively speed up the discovery of new patterns. Another unique feature of MILE is that it can incorporate prior knowledge of the data distribution in time sequences into the mining process to further improve performance. Extensive empirical results show that MILE is significantly faster than PrefixSpan. As MILE consumes more memory than PrefixSpan, we also present a solution that trades time efficiency for lower memory use in memory-constrained environments.
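For readers unfamiliar with the baseline, the following is a minimal sketch of PrefixSpan-style prefix-projected pattern growth. It is not MILE, and it is not the authors' implementation. It assumes each sequence is a list of single items (no itemsets), that support is the number of sequences containing a pattern as a subsequence, and that gaps between matched items are allowed. The function name `prefixspan` and the toy data are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def prefixspan(sequences, min_support):
    """Return {pattern tuple: support} for every frequent sequential pattern.

    sequences: list of lists of hashable items, one item per position.
    min_support: minimum number of sequences that must contain a pattern.
    """
    results = {}

    def grow(prefix, projected):
        # projected holds (sequence index, offset) pairs; the suffix
        # sequences[i][offset:] is what remains after the first match
        # of the current prefix in sequence i.
        counts = defaultdict(int)
        for seq_idx, start in projected:
            # Count each item at most once per sequence.
            for item in set(sequences[seq_idx][start:]):
                counts[item] += 1

        for item, count in counts.items():
            if count < min_support:
                continue
            pattern = prefix + (item,)
            results[pattern] = count

            # Project: keep only the part after the first occurrence of item.
            new_projected = []
            for seq_idx, start in projected:
                seq = sequences[seq_idx]
                try:
                    pos = seq.index(item, start)
                except ValueError:
                    continue  # this suffix does not contain the item
                new_projected.append((seq_idx, pos + 1))
            grow(pattern, new_projected)

    grow((), [(i, 0) for i in range(len(sequences))])
    return results


# Toy data: three short symbolic sequences (illustrative only).
toy_sequences = [list("abcb"), list("acb"), list("bca")]
toy_patterns = prefixspan(toy_sequences, min_support=2)
```

Each recursive call scans only the projected suffixes rather than the full data. According to the abstract, MILE goes further by reusing knowledge of already discovered patterns to avoid redundant scanning. The sketch does not attempt to reproduce that step.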


References

Agrawal, R. and Srikant, R., “Mining Sequential Patterns,” in Proc. of the 11th Int’l Conf. on Data Engineering, pp. 3-14, 1995.
Ayres, J., Flannick, J., Gehrke, J. and Yiu, T., “Sequential Pattern Mining Using a Bitmap Representation,” in Proc. of the 8th Int’l Conf. on Knowledge Discovery and Data Mining, pp. 429-435, 2002.
Bettini, C., Wang, X.S., Jajodia, S. and Lin, J., “Discovering Frequent Event Patterns with Multiple Granularities in Time Sequences,” IEEE Transactions on Knowledge and Data Engineering, 10-2, pp. 222-237, 1998.
Charikar, M., Chen, K. and Farach-Colton, M., “Finding Frequent Items in Data Streams,” in Proc. of Int’l Colloquium on Automata, Languages and Programming, pp. 508-515, 2002.
Das, G., Lin, K., Mannila, H., Renganathan, G. and Smyth, P., “Rule Discovery from Time Series,” in Proc. of the 4th Int’l Conf. of Knowledge Discovery and Data Mining, pp. 16-22, 1998.
Gao, L. and Wang, X.S., “Continually Evaluating Similarity-based Pattern Queries on a Streaming Time Series,” in Proc. of the 2002 ACM SIGMOD Int’l Conf. on Management of Data, pp. 370-381, 2002.
Keogh, E. and Lin, J., “Clustering of Time Series Subsequences is Meaningless: Implications for Previous and Future Research,” Knowledge and Information Systems, 8-2, pp. 154-177, 2005.
Keogh, E., and Smyth, P., “A Probabilistic Approach to Fast Pattern Matching in Time Series Databases,” in Proc. of the 3rd Int’l Conf. of Knowledge Discovery and Data Mining, pp. 16-22, 1997.
Manku, G. S. and Motwani, R., “Approximate Frequency Counts over Data Streams,” in Proc. of the 28th Int’l Conf. on Very Large Data Bases, pp. 346-357, 2002.
Mannila, H., Toivonen, H. and Verkamo, A.I., “Discovery of Frequent Episodes in Event Sequences,” Data Mining and Knowledge Discovery, 1-3, pp. 259-289, 1997.
Oates, T. and Cohen, P.R., “Searching for Structure in Multiple Streams of Data,” in Proc. of the 13th Int’l Conf. on Machine Learning, pp. 346-354, 1996.
Pei, J., Han, J., Mortazavi-Asl, B., Pinto, H., Chen, Q., Dayal, U. and Hsu, M., “PrefixSpan: Mining Sequential Patterns Efficiently by Prefix Projected Pattern Growth,” in Proc. of the 17th Int’l Conf. on Data Engineering, pp. 215-226, 2001.
Pei, J., Han, J., Mortazavi-Asl, B., Wang, J., Pinto, H., Chen, Q., Dayal, U. and Hsu, M., “Mining Sequential Patterns by Pattern-Growth: The PrefixSpan Approach,” IEEE Transactions on Knowledge and Data Engineering, 16-11, pp. 1424-1440, 2004.
Srikant, R. and Agrawal, R., “Mining Sequential Patterns: Generalized and Performance Improvements,” in Proc. of the 5th Int’l Conf. on Extending Database Technology, pp. 3-17, 1996.
Wang, M. and Wang, X.S., “Efficient Evaluation of Composite Correlations for Streaming Time Series,” in Proc. of the 4th Int’l Conf. on Web-Age Information Management, pp. 369-380, 2003.
Yang, Y., Webb, G. and Wu, X., “Discretization Methods,” in Data Mining and Knowledge Discovery Handbook: A Complete Guide for Practitioners and Researchers (O. Maimon and L. Rokach eds.), Kluwer Academic Publishers, 2005.
Yi, B., Sidiropoulos, N., Johnson, W., Jagadish, H.V., Faloutsos, C. and Biliris, A., “Online Data Mining for Co-Evolving Time Sequences,” in Proc. of the 16th Int’l Conf. on Data Engineering, pp. 13-22, 2000.
Zaki, M. J., “Efficient Enumeration of Frequent Sequences,” in Proc. of the 7th Int’l Conf. on Information and Knowledge Management, pp. 68-75, 1998.
Zaki, M. J., “SPADE: An Efficient Algorithm for Mining Frequent Sequences,” Machine Learning, 42-1/2, pp. 31-60, 2001.
Zhu, Y. and Shasha, D., “StatStream: Statistical Monitoring of Thousands of Data Streams in Real Time,” in Proc. of the 28th Int’l Conf. on Very Large Data Bases, pp. 358-369, 2002.


Author information

Authors and Affiliations

Department of Statistics, University of California, Los Angeles, 8951 Mathematical Sciences Building, Los Angeles, CA, 90095, USA
Gong Chen
Department of Computer Science, University of Vermont, 33 Colchester Avenue, Burlington, VT, 05405, USA
Xindong Wu
Department of Computer Science and Engineering, Florida Atlantic University, Boca Raton, FL, 33431, USA
Xingquan Zhu


Corresponding author

Correspondence to Gong Chen.

About this article

Cite this article

Chen, G., Wu, X. & Zhu, X. Mining Sequential Patterns across Time Sequences. New Gener. Comput. 26, 75–96 (2007). https://doi.org/10.1007/s00354-007-0036-2


Received: 07 February 2006
Revised: 06 February 2007
Published: 14 March 2008
Issue Date: November 2007
DOI: https://doi.org/10.1007/s00354-007-0036-2
