Abstract
This paper discusses the problem of long pattern generation in dense databases. Interest in techniques for maximal pattern generation has grown in recent years. We survey this class of methods for long pattern generation, which differ considerably from the level-wise approach of traditional methods. Many of these techniques are rooted in combinatorial tricks that can be applied only when the generation of frequent patterns is not forced to be level-wise. We present an overview of the different kinds of methods that can be used to improve counting and search space exploration for long patterns.
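To make the contrast with level-wise generation concrete, the following is a minimal Python sketch of depth-first maximal itemset generation with a simple lookahead step. It is an illustration under stated assumptions, not the algorithm of any particular surveyed method: the function names (`support`, `maximal_frequent_itemsets`), the lexicographic item ordering, and the naive support counting are all choices made here for brevity. The lookahead tests whether the current itemset together with all of its remaining candidate extensions is frequent. If so, the whole subtree can be skipped, which is the kind of shortcut that a strictly level-wise search cannot take.

```python
from typing import FrozenSet, Iterable, List


def support(itemset: FrozenSet[str], transactions: List[FrozenSet[str]]) -> int:
    """Count the transactions that contain every item of `itemset` (naive scan)."""
    return sum(1 for t in transactions if itemset <= t)


def maximal_frequent_itemsets(
    transactions: Iterable[Iterable[str]], min_support: int
) -> List[FrozenSet[str]]:
    """Depth-first search for maximal frequent itemsets (illustrative sketch)."""
    txns = [frozenset(t) for t in transactions]
    # Keep only frequent single items, in a fixed (lexicographic) order.
    items = sorted({i for t in txns for i in t})
    items = [i for i in items if support(frozenset([i]), txns) >= min_support]
    if not items:
        return []

    maximal: List[FrozenSet[str]] = []

    def is_covered(s: FrozenSet[str]) -> bool:
        # True if s is a subset of an already discovered maximal itemset.
        return any(s <= m for m in maximal)

    def dfs(head: FrozenSet[str], tail: List[str]) -> None:
        full = head | frozenset(tail)
        if is_covered(full):
            return  # every itemset in this subtree is already subsumed
        # Lookahead: if head plus all remaining items is frequent,
        # record it and skip the entire subtree.
        if support(full, txns) >= min_support:
            maximal.append(full)
            return
        extended = False
        for idx, item in enumerate(tail):
            candidate = head | {item}
            if support(candidate, txns) >= min_support:
                extended = True
                dfs(candidate, tail[idx + 1:])
        # A non-empty head with no frequent extension is maximal
        # unless an earlier branch already found a superset of it.
        if not extended and head and not is_covered(head):
            maximal.append(head)

    dfs(frozenset(), items)
    return maximal


if __name__ == "__main__":
    demo = [
        {"a", "b", "c", "d"},
        {"a", "b", "c", "d"},
        {"a", "b", "c"},
        {"a", "e"},
        {"a", "e"},
        {"b", "e"},
    ]
    for itemset in maximal_frequent_itemsets(demo, min_support=2):
        print(sorted(itemset))
```

In this toy data the long pattern {a, b, c, d} is reached after four extensions along a single branch, and every later branch whose itemsets are subsets of it is discarded by the subsumption check rather than being counted level by level.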