DOI: 10.1145/1943628.1943673
research-article

Improving the efficiency of FP tree construction using transactional patternbase

Published: 21 December 2010

ABSTRACT

Mining frequent patterns in transaction databases has been a popular theme in data mining research. A common task is finding patterns among the large set of data items in database transactions. The Apriori algorithm is a widely accepted method of generating frequent patterns, but it requires many scans of the database and thus seriously taxes resources. Methods currently used to improve the efficiency of the Apriori algorithm include hash-based itemset counting, transaction reduction, partitioning, sampling, and dynamic itemset counting. The two main approaches to association rule mining are candidate set generation and test, and restricted test only. Both approaches scan a massive database multiple times. In our study, we propose a transaction patternbase, constructed in the first scan of the database. Transactions with the same pattern are stored once in the patternbase, and the frequency of that pattern is incremented for each repetition. Subsequent scans therefore need to read only this compact dataset, which increases the efficiency of the respective methods. We have implemented this technique with the FP-growth method. It outperforms the database approach in many situations and performs exceptionally well when the repetition of transaction patterns is high. It can be used with any association rule mining method.
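
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names build_patternbase, FPNode, and build_fp_tree are hypothetical, and it assumes that a "pattern" is the set of items in a transaction, so item order is ignored. The FP-tree header table and the mining step are omitted for brevity.

```python
from collections import Counter


def build_patternbase(transactions):
    """Single scan: map each distinct transaction pattern to its frequency."""
    patternbase = Counter()
    for transaction in transactions:
        # Assumption: a pattern is the set of items, so item order is ignored.
        patternbase[frozenset(transaction)] += 1
    return patternbase


class FPNode:
    """One node of the prefix tree: an item, a count, and child links."""

    def __init__(self, item, parent=None):
        self.item = item
        self.parent = parent
        self.count = 0
        self.children = {}


def build_fp_tree(patternbase, min_support):
    """Build the FP-tree prefix structure from the patternbase, not the raw data."""
    # Item supports come from the compact patternbase, weighted by frequency.
    supports = Counter()
    for pattern, freq in patternbase.items():
        for item in pattern:
            supports[item] += freq

    root = FPNode(None)
    for pattern, freq in patternbase.items():
        # Keep frequent items, ordered by descending support (ties by item).
        ordered = sorted(
            (item for item in pattern if supports[item] >= min_support),
            key=lambda item: (-supports[item], item),
        )
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, parent=node)
                node.children[item] = child
            # One insertion accounts for every repetition of the pattern.
            child.count += freq
            node = child
    return root, supports
```

In this sketch, both passes of FP-tree construction (support counting and tree insertion) iterate over the distinct patterns only, so the saving grows with the number of repeated transactions. When every transaction is unique, the patternbase is as large as the database and no saving is expected.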

    • Published in

      ACM Other conferences
      FIT '10: Proceedings of the 8th International Conference on Frontiers of Information Technology
      December 2010
      281 pages
      ISBN:9781450303422
      DOI:10.1145/1943628

      Copyright © 2010 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States
