Abstract
In this paper, we systematically explore an itemset-based extension approach to generating candidate sequences, which yields a better and more straightforward traversal of the search space than the traditional item-based extension approach. Based on this candidate generation approach, we present FINDER, a novel algorithm for discovering the set of all frequent sequences. FINDER consists of two separate steps. In the first step, all frequent itemsets are discovered, which allows the algorithm to benefit from existing efficient itemset mining algorithms. In the second step, all frequent sequences containing at least two frequent itemsets are detected by combining depth-first search with itemset-based extension candidate generation. A vertical bitmap data representation is adopted for fast support counting. Several pruning strategies are used to reduce the search space and minimize the cost of computation. An extensive set of experiments demonstrates the effectiveness and linear scalability of the proposed algorithm.
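The two-step structure described above can be illustrated with a minimal sketch. This is not the authors' FINDER implementation: the function names (`frequent_itemsets`, `itemset_bitmaps`, `after_first`, `mine`), the brute-force itemset enumeration in step 1, and the toy database are all assumptions made for illustration, and the only pruning shown is the minimum-support check. Each bitmap is stored as one Python integer per sequence, where bit j is set when the itemset is contained in the j-th element of that sequence.

```python
from itertools import combinations


def frequent_itemsets(db, min_sup):
    """Step 1 (brute-force stand-in for a real itemset miner): return itemsets
    that occur inside a single element of at least min_sup sequences."""
    counts = {}
    for seq in db:
        seen = set()
        for element in seq:
            items = sorted(element)
            for k in range(1, len(items) + 1):
                seen.update(frozenset(c) for c in combinations(items, k))
        for s in seen:  # count each itemset once per sequence
            counts[s] = counts.get(s, 0) + 1
    return {s for s, c in counts.items() if c >= min_sup}


def itemset_bitmaps(db, itemsets):
    """Vertical layout: for each itemset, one int per sequence whose bit j is
    set when the itemset is a subset of element j of that sequence."""
    return {
        s: [sum(1 << j for j, el in enumerate(seq) if s <= el) for seq in db]
        for s in itemsets
    }


def after_first(mask):
    """Return a mask selecting every position strictly after the lowest set bit."""
    if mask == 0:
        return 0
    low = mask & -mask          # isolate the lowest set bit
    return ~((low << 1) - 1)    # all bits above it (Python ints are unbounded)


def mine(db, min_sup):
    """Step 2: depth-first search that extends a pattern by appending one
    whole frequent itemset at a time (itemset-based extension)."""
    freq = frequent_itemsets(db, min_sup)
    maps = itemset_bitmaps(db, freq)
    order = sorted(freq, key=lambda s: (len(s), sorted(s)))
    results = {}

    def dfs(pattern, masks):
        for s in order:
            new = [after_first(m) & e for m, e in zip(masks, maps[s])]
            support = sum(1 for m in new if m)
            if support >= min_sup:  # prune infrequent candidates
                results[pattern + (s,)] = support
                dfs(pattern + (s,), new)

    for s in order:
        dfs((s,), maps[s])
    return results


# Hypothetical toy database: three sequences of itemsets.
db = [
    [{"a", "b"}, {"c"}, {"a"}],
    [{"a"}, {"c"}],
    [{"a", "b"}, {"c"}],
]
patterns = mine(db, min_sup=2)
for pattern, support in patterns.items():
    print([sorted(s) for s in pattern], support)
```

With a minimum support of 2, this toy database yields three sequences of length two: (a)(c) with support 3, and (b)(c) and (a, b)(c) with support 2 each. Appending whole itemsets means a multi-item element such as (a, b) is reached in one extension step rather than by growing it one item at a time.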
Copyright information
© 2013 Springer International Publishing Switzerland
About this paper
Cite this paper
Zhixin, M., Yusheng, X., Dillon, T.S. (2013). Mining Frequent Sequences Using Itemset-Based Extension. In: Cao, L., et al. Behavior and Social Computing. BSIC 2013, BSI 2013. Lecture Notes in Computer Science, vol 8178. Springer, Cham. https://doi.org/10.1007/978-3-319-04048-6_1
DOI: https://doi.org/10.1007/978-3-319-04048-6_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-04047-9
Online ISBN: 978-3-319-04048-6
eBook Packages: Computer Science, Computer Science (R0)