Mining Frequent Sequences Using Itemset-Based Extension

Zhixin, Ma; Yusheng, Xu; Dillon, Tharam S.

doi:10.1007/978-3-319-04048-6_1

Ma Zhixin, Xu Yusheng & Tharam S. Dillon

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8178)


Abstract

In this paper, we systematically explore an itemset-based extension approach for generating candidate sequences, which yields better and more straightforward search space traversal than the traditional item-based extension approach. Based on this candidate generation approach, we present FINDER, a novel algorithm for discovering the set of all frequent sequences. FINDER consists of two separate steps. In the first step, all frequent itemsets are discovered, which lets us benefit from existing efficient itemset mining algorithms. In the second step, all frequent sequences with at least two frequent itemsets are detected by combining depth-first search with itemset-based extension candidate generation. A vertical bitmap data representation is adopted for rapid support counting. Several pruning strategies are used to reduce the search space and minimize the cost of computation. An extensive set of experiments demonstrates the effectiveness and linear scalability of the proposed algorithm.
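
The two-step scheme described in the abstract can be illustrated with a minimal Python sketch. This is not the FINDER implementation, whose details are not given here. It assumes a database of sequences of itemsets, uses a naive depth-first itemset miner as a stand-in for an efficient one, and applies only support-based pruning. The "positions strictly after the first occurrence" bitmap transformation is borrowed from bitmap-based sequence miners such as SPAM and is an assumption about how the extension step might be realized. All function names (item_bitmaps, frequent_itemsets, frequent_sequences, after_first) are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Bitmaps = Tuple[int, ...]  # one position bitmask per input sequence


def item_bitmaps(db: List[List[Set[str]]]) -> Dict[str, Bitmaps]:
    """Vertical layout: for each item, bit p of entry s is set when the
    item occurs in the p-th itemset of sequence s."""
    items = sorted({i for seq in db for t in seq for i in t})
    return {
        i: tuple(sum(1 << p for p, t in enumerate(seq) if i in t) for seq in db)
        for i in items
    }


def support(bm: Bitmaps) -> int:
    """Number of sequences containing at least one occurrence."""
    return sum(1 for b in bm if b)


def frequent_itemsets(db, minsup: int) -> Dict[FrozenSet[str], Bitmaps]:
    """Step 1 (naive stand-in): depth-first itemset enumeration, where the
    bitmap of X + {i} is the bitwise AND of the bitmaps of X and {i}."""
    single = item_bitmaps(db)
    items = [i for i in single if support(single[i]) >= minsup]
    result: Dict[FrozenSet[str], Bitmaps] = {}

    def grow(prefix: Tuple[str, ...], bm: Bitmaps, start: int) -> None:
        for k in range(start, len(items)):
            new_bm = tuple(a & b for a, b in zip(bm, single[items[k]]))
            if support(new_bm) >= minsup:  # infrequent itemsets are pruned
                itemset = prefix + (items[k],)
                result[frozenset(itemset)] = new_bm
                grow(itemset, new_bm, k + 1)

    grow((), tuple((1 << len(seq)) - 1 for seq in db), 0)
    return result


def after_first(b: int) -> int:
    """Mask selecting every position strictly after the lowest set bit.
    The result is negative (infinite leading ones) and is only ever
    ANDed with a non-negative bitmap."""
    if b == 0:
        return 0
    low = b & -b
    return ~((low << 1) - 1)


def frequent_sequences(db, minsup: int) -> Dict[Tuple[FrozenSet[str], ...], int]:
    """Step 2: depth-first search that appends one whole frequent itemset
    per extension, so every reported pattern has at least two itemsets."""
    itemsets = frequent_itemsets(db, minsup)
    found: Dict[Tuple[FrozenSet[str], ...], int] = {}

    def extend(pattern: Tuple[FrozenSet[str], ...], bm: Bitmaps) -> None:
        shifted = tuple(after_first(b) for b in bm)
        for x, xbm in itemsets.items():
            new_bm = tuple(s & b for s, b in zip(shifted, xbm))
            sup = support(new_bm)
            if sup >= minsup:  # infrequent candidates are not extended
                found[pattern + (x,)] = sup
                extend(pattern + (x,), new_bm)

    for x, xbm in itemsets.items():
        extend((x,), xbm)
    return found
```

Because each extension appends a complete frequent itemset rather than a single item, the search tree in this sketch has one level per itemset of the pattern, and support counting reduces to bitwise AND operations on the per-sequence bitmaps.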



Author information

Authors and Affiliations

University of Padova, Italy
Ma Zhixin & Xu Yusheng
School of Information Science and Technology, Lanzhou University, Lanzhou, China
Ma Zhixin & Xu Yusheng
School of Information System, Curtin University, Perth, Australia
Tharam S. Dillon


Editor information

Editors and Affiliations

Advanced Analytics Institute, University of Technology, 2-12 Blackfriars Street, Chippendale, Blackfriars Campus, NSW 2008, Sydney, Australia
Longbing Cao
Institute of Scientific and Industrial Research, Osaka University, Japan
Hiroshi Motoda
Department of Computer Science, University of Minnesota, USA
Jaideep Srivastava
School of Information Systems, Singapore Management University, 80 Stamford Road, 178902, Singapore
Ee-Peng Lim
Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong
Irwin King
Department of Computer Science, University of Illinois at Chicago, 851 S. Morgan St., Rm 1138 SEO, 60607, Chicago, IL, USA
Philip S. Yu
Leibniz Universität Hannover, Germany
Wolfgang Nejdl
Advanced Analytics Institute, University of Technology, Sydney, Australia
Guandong Xu
Deakin University, Burwood, VIC, Australia
Gang Li
Shanghai Jiao Tong University, 200240, Shanghai, China
Ya Zhang


Cite this paper

Zhixin, M., Yusheng, X., Dillon, T.S. (2013). Mining Frequent Sequences Using Itemset-Based Extension. In: Cao, L., et al. Behavior and Social Computing. BSIC 2013, BSI 2013. Lecture Notes in Computer Science, vol 8178. Springer, Cham. https://doi.org/10.1007/978-3-319-04048-6_1


DOI: https://doi.org/10.1007/978-3-319-04048-6_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-04047-9
Online ISBN: 978-3-319-04048-6
eBook Packages: Computer Science, Computer Science (R0)
