GraSeq: A Novel Approximate Mining Approach of Sequential Patterns over Data Stream

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4632)

Abstract

Sequential pattern mining is an important data mining approach with broad applications. Traditional mining algorithms designed for databases are not suited to data streams. Recently, several approximate sequential pattern mining algorithms over data streams have been presented; they solve some problems but still waste too many system resources when processing long sequences. Based on observation and proof, a novel approximate sequential pattern mining algorithm named GraSeq is proposed. GraSeq uses a directed weighted graph structure and stores the synopsis of sequences with only one scan of the data stream. Furthermore, a subsequence matching method is used to reduce the cost of processing long sequences, and the concept of a validnode is introduced to improve the accuracy of the mining results. Our experimental results demonstrate that the algorithm is effective and efficient.
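The abstract does not give the details of GraSeq's graph, so the following is only a minimal sketch of the general idea of keeping a one-pass synopsis in a directed weighted graph. It is not the authors' algorithm. The class `GraphSynopsis`, its methods, and the choice of "edge weight = number of sequences in which one item is immediately followed by another" are all assumptions made for illustration; the subsequence matching method and the validnode concept are not modelled.

```python
from collections import defaultdict


class GraphSynopsis:
    """Hypothetical one-pass synopsis of a stream of sequences.

    Nodes are items; the weight of edge (a, b) counts the sequences in
    which a is immediately followed by b. This is an illustrative
    assumption, not the structure defined in the GraSeq paper.
    """

    def __init__(self):
        self.node_count = defaultdict(int)   # item -> number of sequences containing it
        self.edge_weight = defaultdict(int)  # (a, b) -> number of sequences with a then b
        self.n_sequences = 0

    def add_sequence(self, seq):
        """Fold one sequence into the synopsis; the sequence is not stored."""
        self.n_sequences += 1
        for item in set(seq):
            self.node_count[item] += 1
        # Count each adjacent pair at most once per sequence.
        for pair in set(zip(seq, seq[1:])):
            self.edge_weight[pair] += 1

    def approx_support(self, pattern):
        """Upper-bound estimate of the support of a contiguous pattern.

        Every sequence containing the pattern contains each of its adjacent
        pairs, so the smallest edge weight on the path bounds the true count.
        """
        if not pattern or self.n_sequences == 0:
            return 0.0
        if len(pattern) == 1:
            return self.node_count.get(pattern[0], 0) / self.n_sequences
        weights = [self.edge_weight.get(p, 0) for p in zip(pattern, pattern[1:])]
        return min(weights) / self.n_sequences


# Usage: each sequence is seen exactly once, as in a stream.
syn = GraphSynopsis()
for s in (["a", "b", "c"], ["a", "b", "d"], ["b", "c", "a"]):
    syn.add_sequence(s)

estimate = syn.approx_support(["a", "b", "c"])  # 2/3, while the exact support is 1/3
```

The estimate overshoots here because the edges (a, b) and (b, c) each occur in two sequences but occur together in only one. This kind of error is what makes such a synopsis approximate, and it is the sort of inaccuracy that additional bookkeeping would have to reduce.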

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Li, H., Chen, H. (2007). GraSeq: A Novel Approximate Mining Approach of Sequential Patterns over Data Stream. In: Alhajj, R., Gao, H., Li, J., Li, X., Zaïane, O.R. (eds) Advanced Data Mining and Applications. ADMA 2007. Lecture Notes in Computer Science, vol 4632. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73871-8_37

  • DOI: https://doi.org/10.1007/978-3-540-73871-8_37

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-73870-1

  • Online ISBN: 978-3-540-73871-8

  • eBook Packages: Computer Science (R0)
