Abstract
Data-mining is the extraction of meaningful patterns from the large source of data. Association Rule Mining (ARM) is an important data mining technique. Mining of frequent patterns is a very important association rule mining problem. The previous approach i.e. Apriori suffers from the candidate-generation and test mechanism. The Apriori approach becomes inefficient when either the length of the frequent set or length of the Transaction Database (TDB) increases. The algorithm adopts bottom up breadth first approach for the mining purpose. In this research work, we have proposed a Reduced Scanning Transaction Database (RSTDB) algorithm that uses certain heuristic function which reduces the number of Transaction Database passes required to generate the maximum frequent set required for Association Rule Mining (ARM). The approach is a hybrid of bottom up and top down approach. It uses both upward and downward closure properties for frequent item sets evaluation.
In this work we will compare the Apriori approach with the above proposed approach for frequent pattern mining. We will try to evaluate the shortcoming of the proposed approach and also look as to how much efficient it is and in which cases. The RSTDB algorithm not only reduces the database scans but also will help in reducing the number of candidategeneration for a phase that is having a value less than the minimum support threshold value.
Preview
Unable to display preview. Download preview PDF.
References
Agrawal, R., Imielinski, T., and Swami, A.N.: Mining association rules between sets of items in large databases. Proceedings of ACM SIGMOD International Conference on Management of Data, ACM Press, Washington DC, pp. 207–216, May (1993)
Zaki, M.J.: Scalable Algorithms for Association Mining. IEEE Transactions on Knowledge and Data Engineering, vol. 12, no. 3, pp. 372–390, May/June (2000)
Han, J., Pei, J., Yin, Y.: Mining Frequent Patterns without Candidate Generation. Proceedings of ACM SIGMOD Internationa 1 Conference on Management of Data, ACM Press, Dallas, Texas, pp. 1–12, May (2000)
Pei, J., Han, J., Lu, H., Nishio, S., Tang, S., and Yang, D.: Hmine: Hyper-Structure Mining of Frequent Patterns in Large Databases. Proceedings of IEEE International Conference on Data Mining, pp. 441–448 (2001)
Pietracaprina, Zandolin, D.: Mining Frequen t Item sets Using Patricia Tries. FIMI’ 03, Frequent Itemset Mining Implementations, Proceedings of the ICDM 2003 Workshop on Frequent Item set Mining Implementations, Melbourne, Florida, Dec. (2003)
Grahne, G., Zhu, J.: Efficiently using prefix-trees in mining frequent itemsets. FIMI’ 03, Frequent Itemset Mining Implementations, Proceedings of the ICDM 2003 Workshop on Frequent Itemset Mining Implementations, Melbourne, Florida, December (2003)
Burdick, D., Calimlim, M., Flannick, J., Gehrke, J.: MAFIA: A Maximal Frequent Itemset Algorithm. IEEE Transactions on Knowledge and Data Engineering, 17, 1490–1505, Nov. (2005).
Zaki, M.J., Parthasarathy, S., Ogihara, M., Li, W.: New algorithms for fast discovery of association rules. Proceedings of the Third International Conference on Knowledge Discovery and Data Mining, AAAI Press, pp. 283–286 (1997)
Shenoy, P., Haritsa, J.R., Sudarshan, S., Bhalotia, G., Bawa, M., Shah, D.: Turbo-charging vertical mining of large databases. Proceedings of ACM SIGMOD Intnational Conference on Management of Data, ACM Press, Dallas, Texas, pp. 22–23, May (2000)
Burdick, D., Calimlim, M., and Gehrke, J.: MAFIA: a maximal frequent item set algorithm for transactional databases. Proceedings of International Conference on Data Engineering, Heidelberg, Germany, pp. 443–452, April (2001)
Zaki, M.J., Gouda, K.: Fast vertical mining using diffsets. Proceedings of the Nineth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Washington, D.C., ACM Press, New York, pp. 326–335, (2003)
Agrawal, R., Agarwal, C., Prasad, V.: A Tree Projection Algorithm for Generation of Frequent Item Sets. Parallel and Distributed Computing, pp. 350–371, (2000)
Singh, V.K., Shah, V., Jain Y.K., Shukla, A., Thoke, A.S., Singh, V.K., Dule, C., Parganiha, V.: Proposing an Efficient Method for Frequent Pattern Mining. has been Accepted for Oral Presentation at the Conference and publication in Proceeding of World Academy of Science, Engineering and Technology, Volume 36, International Conference on Computational and Statistical Sciences, Bangkok Dec 9 (2008)
Singh, V.K., Shah V.: Minimizing Space Time Complexity in Frequent Pattern Mining by Reducing Transaction Database Scanning and Using Pattern Growth Methods. To appear in Chhattisgarh Journal, of Science and Technology (2008)
Singh, V.K., Shah V.: Minimizing Space Time Tradeoff in Frequent Pattern Mining Using Pattern growth Methods. Proceedings of Tech Acme 08, 17–19 Oct Bhopal (2008)
Singh, V.K., Singh, V.K.: The Huge Potential of Information Technology. Proceedings of National Convention on Global Leadership: Strategies and Challenges for Indian Business, Feb 10–11, GGDU Bilaspur (2007).
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2009 Indian Institute of Information Technology, India
About this paper
Cite this paper
Singh, V.K., Singh, V.K. (2009). Minimizing Space Time Complexity by RSTDB a New Method for Frequent Pattern Mining. In: Tiwary, U.S., Siddiqui, T.J., Radhakrishna, M., Tiwari, M.D. (eds) Proceedings of the First International Conference on Intelligent Human Computer Interaction. Springer, New Delhi. https://doi.org/10.1007/978-81-8489-203-1_35
Download citation
DOI: https://doi.org/10.1007/978-81-8489-203-1_35
Publisher Name: Springer, New Delhi
Print ISBN: 978-81-8489-404-2
Online ISBN: 978-81-8489-203-1
eBook Packages: Computer ScienceComputer Science (R0)