Abstract

Data mining is the extraction of meaningful patterns from large data sources. Association Rule Mining (ARM) is an important data mining technique, and mining frequent patterns is one of its central problems. The established Apriori approach suffers from the overhead of its candidate generate-and-test mechanism: it becomes inefficient when either the length of the frequent set or the size of the Transaction Database (TDB) increases. Apriori adopts a bottom-up, breadth-first approach to mining. In this research work, we propose a Reduced Scanning Transaction Database (RSTDB) algorithm that uses a heuristic function to reduce the number of Transaction Database passes required to generate the maximum frequent set needed for ARM. The approach is a hybrid of the bottom-up and top-down approaches, and it uses both the upward and downward closure properties to evaluate frequent item sets.
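To make the cost of the generate-and-test mechanism concrete, the following is a minimal sketch of a level-wise Apriori loop that also counts database passes. It is an illustration of the baseline only, not the RSTDB algorithm, whose heuristic is not given in this abstract. The function name `apriori`, the use of an absolute support count for `min_support`, and the in-memory list of transactions are assumptions made for the example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Level-wise (bottom-up, breadth-first) Apriori sketch.

    Returns (frequent, passes): a dict mapping each frequent itemset to its
    support count, and the number of full passes over the transactions.
    min_support is an absolute count (an assumption of this sketch).
    """
    transactions = [frozenset(t) for t in transactions]
    frequent = {}
    passes = 0

    # Level-1 candidates: every distinct item. Collecting the item universe
    # is treated here as part of loading the data, not as a counted pass.
    candidates = {frozenset([item]) for t in transactions for item in t}
    k = 1
    while candidates:
        # Test step: one full pass over the database per level.
        passes += 1
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)

        # Generate step: join frequent k-itemsets into (k+1)-candidates and
        # keep only those whose k-item subsets are all frequent.
        k += 1
        candidates = set()
        for a, b in combinations(level, 2):
            union = a | b
            if len(union) == k and all(
                frozenset(s) in level for s in combinations(union, k - 1)
            ):
                candidates.add(union)
    return frequent, passes
```

Each level costs one pass, so the number of passes grows with the length of the longest candidate itemset, which is the inefficiency the abstract attributes to Apriori.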

In this work we compare the Apriori approach with the proposed approach for frequent pattern mining. We evaluate the shortcomings of the proposed approach and examine how efficient it is and in which cases. The RSTDB algorithm not only reduces the number of database scans but also helps reduce candidate generation for a phase whose value is below the minimum support threshold.
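The two closure properties mentioned in the abstract can be illustrated with a short sketch. It assumes the usual reading of the terms: every subset of a frequent itemset is frequent, and every superset of an infrequent itemset is infrequent. The helper names below are hypothetical, and the code shows only how the properties save support counting; it is not the RSTDB heuristic itself.

```python
from itertools import combinations


def support(itemset, transactions):
    """Count the transactions that contain every item of itemset."""
    return sum(1 for t in transactions if itemset <= t)


def subsets_implied_frequent(itemset):
    """Top-down use: once itemset is known to be frequent, all of its
    non-empty proper subsets are frequent without further counting."""
    items = sorted(itemset)
    return {
        frozenset(c)
        for r in range(1, len(items))
        for c in combinations(items, r)
    }


def prune_supersets(candidates, infrequent):
    """Bottom-up use: drop any candidate that contains a known-infrequent
    itemset, since such a candidate cannot be frequent."""
    return {c for c in candidates if not any(bad <= c for bad in infrequent)}
```

For example, if a three-item set meets the threshold in a single count, its six proper subsets need no counting of their own, while a single infrequent item removes every candidate that contains it.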




Copyright information

© 2009 Indian Institute of Information Technology, India

About this paper

Cite this paper

Singh, V.K., Singh, V.K. (2009). Minimizing Space Time Complexity by RSTDB a New Method for Frequent Pattern Mining. In: Tiwary, U.S., Siddiqui, T.J., Radhakrishna, M., Tiwari, M.D. (eds) Proceedings of the First International Conference on Intelligent Human Computer Interaction. Springer, New Delhi. https://doi.org/10.1007/978-81-8489-203-1_35


  • DOI: https://doi.org/10.1007/978-81-8489-203-1_35

  • Publisher Name: Springer, New Delhi

  • Print ISBN: 978-81-8489-404-2

  • Online ISBN: 978-81-8489-203-1

  • eBook Packages: Computer Science, Computer Science (R0)
