Sequential Patterns

Wang, Jianyong

doi:10.1007/978-1-4614-8265-9_343

Synonyms

Frequent subsequences

Definition

A sequence database \( D=\{S_1,S_2,\ldots,S_n\} \) for sequential pattern mining consists of \( n \) input sequences (where \( n\ge 1 \)), and an input sequence \( S_i=\langle e_{i1},e_{i2},\ldots,e_{im}\rangle \) \( (1\le i\le n) \) is an ordered list of \( m \) events (where \( m\ge 1 \)). Each event \( e_{ij} \) \( (1\le i\le n,\ 1\le j\le m) \) is a non-empty set of items. Given two sequences, \( S_a=\langle e_{a1},e_{a2},\ldots,e_{ak}\rangle \) and \( S_b=\langle e_{b1},e_{b2},\ldots,e_{bl}\rangle \), if \( k\le l \) and there exist integers \( 1\le x_1<x_2<\cdots<x_k\le l \) such that \( e_{a1}\subseteq e_{bx_1},\ e_{a2}\subseteq e_{bx_2},\ \ldots,\ e_{ak}\subseteq e_{bx_k} \), then \( S_b \) is said to contain \( S_a \) (or equivalently, \( S_a \) is said to be contained in \( S_b \)). The number of input sequences in \( D \) that contain sequence \( S \) is called the support of \( S \) in \( D \), denoted by \( \mathrm{sup}^D(S) \). Given a user-specified minimum support threshold \( \mathit{min\_sup} \), \( S \) is called a sequential pattern (or a frequent subsequence) in \( D \) if \( \mathrm{sup}^D(S)\ge \mathit{min\_sup} \). If there exists no proper supersequence of a sequential pattern S...
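The containment test and the support count defined above can be illustrated with a minimal Python sketch. It is not taken from the entry: the function names (contains, support, is_sequential_pattern), the example database D, and the item labels are all hypothetical, and the greedy left-to-right matching is just one simple way to decide containment, not any particular mining algorithm from the literature.

```python
from typing import AbstractSet, Sequence

# Assumption: an event is a non-empty set of item labels, and a sequence
# is an ordered list of events. These aliases are illustrative only.
Event = AbstractSet[str]
EventSequence = Sequence[Event]


def contains(s_b: EventSequence, s_a: EventSequence) -> bool:
    """Return True if s_b contains s_a.

    Each event of s_a is matched, in order, to the earliest not-yet-used
    event of s_b that is a superset of it, which yields strictly
    increasing positions x_1 < x_2 < ... < x_k whenever they exist.
    """
    pos = 0
    for event_a in s_a:
        # Skip events of s_b that are not supersets of the current event.
        while pos < len(s_b) and not event_a <= s_b[pos]:
            pos += 1
        if pos == len(s_b):
            return False  # ran out of events in s_b
        pos += 1  # the next event of s_a must match a later position
    return True


def support(database: Sequence[EventSequence], s: EventSequence) -> int:
    """Count the input sequences in the database that contain s."""
    return sum(1 for seq in database if contains(seq, s))


def is_sequential_pattern(
    database: Sequence[EventSequence], s: EventSequence, min_sup: int
) -> bool:
    """Return True if the support of s reaches the threshold min_sup."""
    return support(database, s) >= min_sup


# Hypothetical sequence database with three input sequences.
D = [
    [{"a"}, {"a", "b", "c"}, {"a", "c"}, {"d"}],
    [{"a", "d"}, {"c"}, {"b", "c"}],
    [{"e", "f"}, {"a", "b"}, {"d"}],
]

pattern = [{"a"}, {"c"}]
print(support(D, pattern))                   # 2 (first two sequences)
print(is_sequential_pattern(D, pattern, 2))  # True
```

With min_sup = 2 in this toy database, 〈{a}, {c}〉 qualifies as a sequential pattern, while raising the threshold to 3 would exclude it because the third sequence has no event containing c after one containing a.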

Author information

Authors and Affiliations

Tsinghua University, Beijing, China
Jianyong Wang

Corresponding author

Correspondence to Jianyong Wang.

Editor information

Editors and Affiliations

Georgia Institute of Technology College of Computing, Atlanta, GA, USA
Ling Liu
University of Waterloo School of Computer Science, Waterloo, ON, Canada
M. Tamer Özsu

Section Editor information

School of Computing Science, Simon Fraser Univ., Burnaby, British Columbia, Canada
Jian Pei

About this entry

Cite this entry

Wang, J. (2018). Sequential Patterns. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_343

DOI: https://doi.org/10.1007/978-1-4614-8265-9_343
Published: 07 December 2018
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-8266-6
Online ISBN: 978-1-4614-8265-9
eBook Packages: Computer Science; Reference Module Computer Science and Engineering