Abstract
In this paper we present extensions for continuous pattern mining. Our previous continuous pattern mining algorithm mines the set of all frequent sequences that satisfy the minSup condition. However, those sequences contain an explosive number of frequent subsequences, which makes the patterns very difficult to analyze and understand. To overcome these difficulties, we propose four new algorithms for mining maximal and closed continuous patterns. These algorithms return a superset of the result patterns; a post-pruning step then eliminates the redundant sequences. For each pattern type (maximal or closed), two algorithms are presented: one with and one without additional improvements. The key idea is to omit as many redundant sequences as possible during the exploration. The proposed algorithms reduce the size of the result set when the input sequences have low uniqueness.
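The difference between the full frequent set and its maximal and closed subsets can be illustrated with a small brute-force sketch. This is not one of the four proposed algorithms. It assumes that a continuous pattern is a contiguous subsequence, that support is the number of input sequences containing the pattern, and that minSup is an absolute count. The function names and the toy input are hypothetical.

```python
from collections import defaultdict

def frequent_continuous(sequences, min_sup):
    """Enumerate every contiguous subsequence and keep those whose support
    (assumed: number of input sequences containing it) is >= min_sup."""
    support = defaultdict(int)
    for seq in sequences:
        seen = set()  # count each pattern at most once per input sequence
        for i in range(len(seq)):
            for j in range(i + 1, len(seq) + 1):
                seen.add(tuple(seq[i:j]))
        for pattern in seen:
            support[pattern] += 1
    return {p: s for p, s in support.items() if s >= min_sup}

def is_continuous_sub(short, long):
    """True if `short` occurs in `long` as a contiguous run."""
    n = len(short)
    return any(long[i:i + n] == short for i in range(len(long) - n + 1))

def post_prune(freq, closed):
    """Naive post-pruning over a set of frequent patterns.
    closed=False: drop a pattern if any longer frequent pattern contains it.
    closed=True:  drop it only if such a pattern also has the same support."""
    result = {}
    for p, s in freq.items():
        redundant = any(
            len(q) > len(p)
            and is_continuous_sub(p, q)
            and (not closed or freq[q] == s)
            for q in freq
        )
        if not redundant:
            result[p] = s
    return result

# Hypothetical toy input: four sequences over the items a, b, c, d.
sequences = ["abcd", "abc", "bcd", "ab"]
freq = frequent_continuous(sequences, min_sup=2)
maximal = post_prune(freq, closed=False)
closed_patterns = post_prune(freq, closed=True)

print(len(freq), len(closed_patterns), len(maximal))  # 9 5 2
```

On this input the nine frequent patterns shrink to five closed patterns (which still preserve every support value) and to two maximal patterns (which preserve only which patterns are frequent). The sketch enumerates everything first and prunes afterwards, whereas the abstract describes omitting redundant sequences during the exploration itself.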
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gorawski, M., Jureczek, P. (2011). Extensions for Continuous Pattern Mining. In: Yin, H., Wang, W., Rayward-Smith, V. (eds) Intelligent Data Engineering and Automated Learning - IDEAL 2011. IDEAL 2011. Lecture Notes in Computer Science, vol 6936. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23878-9_24
DOI: https://doi.org/10.1007/978-3-642-23878-9_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23877-2
Online ISBN: 978-3-642-23878-9
eBook Packages: Computer Science, Computer Science (R0)