Efficient frequent sequence mining by a dynamic strategy switching algorithm

Chiu, Ding-Ying; Wu, Yi-Hung; Chen, Arbee L. P.

doi:10.1007/s00778-008-0100-7

Regular Paper
Published: 26 April 2008

Volume 18, pages 303–327, (2009)

The VLDB Journal

Ding-Ying Chiu¹,
Yi-Hung Wu² &
Arbee L. P. Chen³

Abstract

Mining frequent sequences in large databases is an important research topic. The main challenge is the high processing cost caused by the large amount of data. In this paper, we propose a novel strategy that finds all frequent sequences without computing the support counts of non-frequent sequences. Previous work prunes candidate sequences based on frequent sequences of shorter lengths, whereas our strategy prunes candidate sequences according to non-frequent sequences of the same length. As a result, our strategy can be combined with previous approaches to achieve better performance. We then identify three major strategies used in previous work and combine them with our strategy into an efficient algorithm. The novelty of our algorithm lies in its ability to switch dynamically from a previous strategy to our new strategy during the mining process for better performance. Experimental results show that our algorithm outperforms the previous ones under various parameter settings.
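For readers unfamiliar with the baseline, the pruning rule that the abstract attributes to previous work (discarding a candidate when one of its shorter subsequences is not frequent) can be sketched as below. This is only an illustration under simplifying assumptions: each sequence element is a single item, and the function names, toy database, and threshold are hypothetical. It does not reproduce the authors' own strategy, whose details the abstract does not give.

```python
def is_subsequence(pattern, sequence):
    """Return True if `pattern` occurs in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    # `item in it` advances the iterator, so matches must appear in order.
    return all(item in it for item in pattern)


def support(pattern, database):
    """Count the database sequences that contain `pattern`."""
    return sum(1 for seq in database if is_subsequence(pattern, seq))


def prune_by_shorter_frequent(candidates, frequent_shorter):
    """Keep only candidates whose every (k-1)-subsequence is frequent."""
    kept = []
    for cand in candidates:
        # All subsequences obtained by dropping exactly one element.
        subs = {cand[:i] + cand[i + 1:] for i in range(len(cand))}
        if subs <= frequent_shorter:
            kept.append(cand)
    return kept


# Hypothetical toy database and minimum support count.
database = [("a", "b", "c"), ("a", "c"), ("a", "b", "c", "d"), ("b", "d")]
min_support = 2

items = sorted({x for seq in database for x in seq})
frequent_2 = {
    (x, y)
    for x in items
    for y in items
    if support((x, y), database) >= min_support
}

candidates_3 = [("a", "b", "c"), ("a", "b", "d"), ("a", "c", "d")]
survivors = prune_by_shorter_frequent(candidates_3, frequent_2)

# Only the surviving candidates need a support-counting pass.
frequent_3 = [c for c in survivors if support(c, database) >= min_support]
```

In this toy run, two of the three length-3 candidates are discarded before any support counting because they contain the non-frequent pair `("a", "d")`. Note that the surviving candidate still has to be counted against the database; the abstract's contribution targets the cost of counting candidates that turn out to be non-frequent.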

Author information

Authors and Affiliations

Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan, ROC
Ding-Ying Chiu
Department of Information and Computer Engineering, Chung Yuan Christian University, Chung Li, Taiwan, ROC
Yi-Hung Wu
Department of Computer Science, National Chengchi University, Taipei, Taiwan, ROC
Arbee L. P. Chen

Corresponding author

Correspondence to Arbee L. P. Chen.

Additional information

This paper is a major value-added version of the following paper: D. Y. Chiu, Y. H. Wu, A. L. P. Chen, “An Efficient Algorithm for Mining Frequent Sequences by a New Strategy without Support Counting,” Proceedings of IEEE Data Engineering Conference, pp. 375–386, 2004.

About this article

Cite this article

Chiu, DY., Wu, YH. & Chen, A.L.P. Efficient frequent sequence mining by a dynamic strategy switching algorithm. The VLDB Journal 18, 303–327 (2009). https://doi.org/10.1007/s00778-008-0100-7

Received: 24 March 2008
Accepted: 24 March 2008
Published: 26 April 2008
Issue Date: January 2009
DOI: https://doi.org/10.1007/s00778-008-0100-7
