Abstract
In this paper, we discuss a method for finding top-N colossal frequent patterns. A colossal pattern, in our sense, is a maximal frequent pattern whose length is among the top-N largest. Because colossal patterns tend to lie in relatively low regions of an itemset (concept) lattice, an efficient method with effective pruning mechanisms is required.
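To make the target concrete, the following is a brute-force sketch of the definition only, not the method proposed in the paper. It assumes that "top-N length" means the N largest distinct lengths among maximal frequent patterns and that frequency is an absolute support count; the function names `frequent_itemsets` and `top_n_colossal` and the parameter `minsup` are hypothetical. It enumerates every itemset, so it is usable only on toy data.

```python
from itertools import combinations


def frequent_itemsets(transactions, minsup):
    """Enumerate all frequent itemsets by brute force (toy inputs only)."""
    items = sorted(set().union(*transactions))
    frequent = []
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            candidate = frozenset(combo)
            # Support = number of transactions containing the candidate.
            support = sum(1 for t in transactions if candidate <= t)
            if support >= minsup:
                frequent.append(candidate)
    return frequent


def top_n_colossal(transactions, minsup, n):
    """Return maximal frequent itemsets whose length is among the n largest.

    Assumption: "top-N" ranks distinct lengths, so ties are all returned.
    """
    frequent = frequent_itemsets(transactions, minsup)
    # A pattern is maximal if no frequent proper superset exists.
    maximal = [p for p in frequent if not any(p < q for q in frequent)]
    top_lengths = sorted({len(p) for p in maximal}, reverse=True)[:n]
    selected = [p for p in maximal if len(p) in top_lengths]
    return sorted(selected, key=lambda p: (-len(p), sorted(p)))


if __name__ == "__main__":
    data = [set("abcd"), set("abcd"), set("abe"), set("abe"), set("cf")]
    for pattern in top_n_colossal(data, minsup=2, n=2):
        print(sorted(pattern))
```

The exponential enumeration in this sketch is exactly what a pruning-based search is meant to avoid.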
We design a depth-first branch-and-bound algorithm for finding colossal patterns with top-N length, in which the notion of a pattern graph plays an important role. A pattern graph is a compact representation of the class of frequent patterns of a designated length. A colossal pattern can be found as a clique in a pattern graph that satisfies a certain condition. Based on this observation, our algorithm finds the target patterns by examining cliques in a graph defined from the pattern graph. The algorithm is based on a depth-first branch-and-bound method for finding a maximum clique. As the search progresses, the graph under consideration is dynamically updated into a sparser one, which makes finding cliques much easier and the branch-and-bound pruning more powerful. To the best of our knowledge, this is the first algorithm tailored to this problem that can exactly identify top-N colossal patterns. In our experiments, we compare the computational efficiency of our algorithm with that of well-known maximal frequent itemset miners on a synthetic dataset and a benchmark dataset.
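As background for the clique search mentioned above, here is a minimal sketch of a generic depth-first branch-and-bound procedure for a maximum clique. It is not the authors' algorithm: it omits the pattern graph construction, the condition a clique must satisfy, and the dynamic graph update, and it uses only the simplest size bound. The function name `max_clique` and the adjacency-dictionary input format are assumptions made for illustration.

```python
def max_clique(adj):
    """Return one maximum clique of an undirected graph.

    adj maps each vertex to the set of its neighbours (no self-loops).
    """
    best = []

    def expand(clique, candidates):
        nonlocal best
        if len(clique) > len(best):
            best = list(clique)
        # Try candidates with many neighbours among the candidates first.
        order = sorted(candidates,
                       key=lambda u: len(adj[u] & candidates),
                       reverse=True)
        for v in order:
            # Bound: even taking every remaining candidate cannot beat best.
            if len(clique) + len(candidates) <= len(best):
                return
            clique.append(v)
            # Only common neighbours of the whole clique stay candidates.
            expand(clique, candidates & adj[v])
            clique.pop()
            # Branches containing v are exhausted, so drop v.
            candidates = candidates - {v}

    expand([], set(adj))
    return best


if __name__ == "__main__":
    graph = {
        1: {2, 3, 4}, 2: {1, 3, 4}, 3: {1, 2, 4},
        4: {1, 2, 3, 5}, 5: {4, 6}, 6: {5},
    }
    print(sorted(max_clique(graph)))
```

The bound shows why a sparser graph helps: fewer edges mean smaller candidate sets, so the cut-off test succeeds earlier in the search.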
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Okubo, Y., Haraguchi, M. (2012). Finding Top-N Colossal Patterns Based on Clique Search with Dynamic Update of Graph. In: Domenach, F., Ignatov, D.I., Poelmans, J. (eds) Formal Concept Analysis. ICFCA 2012. Lecture Notes in Computer Science, vol 7278. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29892-9_23
DOI: https://doi.org/10.1007/978-3-642-29892-9_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-29891-2
Online ISBN: 978-3-642-29892-9
eBook Packages: Computer Science, Computer Science (R0)