A Fast Algorithm for Frequent Itemset Mining Using Patricia* Structures

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7448)

Abstract

Efficient mining of frequent itemsets from a database plays an essential role in many data mining tasks, such as association rule mining. Many algorithms use a prefix-tree to represent a database and mine frequent itemsets by recursively constructing conditional prefix-trees from it. A (conditional) prefix-tree can be stored in various structures. The construction and traversal costs of prefix-trees, or rather of their storage structures, account for a large proportion of the total cost of such algorithms. The PatriciaMine algorithm employs a Patricia trie to store a prefix-tree and shows good performance. In this study, we introduce an efficient Patricia* structure for storing a prefix-tree. A Patricia* structure is more compact and contiguous than the corresponding Patricia trie, and thus its construction and traversal costs are lower. Previous prefix-tree-based algorithms adopt a similar mining procedure, in which most nodes in a prefix-tree are accessed repeatedly while the prefix-tree is processed. This paper presents a novel mining procedure in which node accesses for a prefix-tree are greatly reduced. We propose the PatriciaMine* algorithm, which combines the Patricia* structure with the proposed procedure. Experimental data show that PatriciaMine* outperforms not only PatriciaMine but also several fast algorithms, such as FPgrowth* and dEclat, on various databases.
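
For readers unfamiliar with the terms, the following Python sketch illustrates the general idea of a prefix-tree over transactions and of Patricia-style path compression, in which a chain of single-child nodes with equal counts collapses into one node. It is an illustration under stated assumptions, not the Patricia* structure or the PatriciaMine* procedure, neither of which is specified in the abstract. All names (`Node`, `build_prefix_tree`, `compress`, and the sample transactions) are hypothetical.

```python
from collections import Counter


class Node:
    """One prefix-tree node: an item, a support count, and child nodes."""

    def __init__(self, item=None):
        self.item = item
        self.count = 0
        self.children = {}


def build_prefix_tree(transactions, min_support):
    """Insert each transaction, restricted to frequent items, into a prefix-tree."""
    freq = Counter(item for t in transactions for item in set(t))
    frequent = {i for i, c in freq.items() if c >= min_support}
    root = Node()
    for t in transactions:
        # Order items by descending frequency (ties broken by item name) so
        # that transactions sharing frequent items also share a prefix.
        items = sorted(set(t) & frequent, key=lambda i: (-freq[i], i))
        node = root
        for item in items:
            node = node.children.setdefault(item, Node(item))
            node.count += 1
    return root


def count_nodes(node):
    """Number of nodes in the uncompressed tree, including the root."""
    return 1 + sum(count_nodes(c) for c in node.children.values())


def compress(node):
    """Return a path-compressed copy as (items, count, children).

    Assumption: a chain is merged only when each node has exactly one
    child and that child carries the same count.
    """
    items = [node.item] if node.item is not None else []
    count = node.count
    cur = node
    while node.item is not None and len(cur.children) == 1:
        (child,) = cur.children.values()
        if child.count != count:
            break
        items.append(child.item)
        cur = child
    return (tuple(items), count, [compress(c) for c in cur.children.values()])


def count_compressed(cnode):
    """Number of nodes in the compressed tree, including the root."""
    return 1 + sum(count_compressed(c) for c in cnode[2])


# Hypothetical sample database; item "g" is infrequent and is dropped.
transactions = [
    ["a", "b", "c", "d"],
    ["a", "b", "c", "d"],
    ["a", "b", "c", "e"],
    ["a", "b", "c", "e", "g"],
]
tree = build_prefix_tree(transactions, min_support=2)
compressed = compress(tree)
print(count_nodes(tree), count_compressed(compressed))
```

In this sample the chain a, b, c is shared by every transaction with the same count, so the compressed tree stores it as a single node. Fewer nodes means fewer allocations and pointer traversals, which is the kind of cost the abstract attributes to the storage structure.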

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Qu, JF., Liu, M. (2012). A Fast Algorithm for Frequent Itemset Mining Using Patricia* Structures. In: Cuzzocrea, A., Dayal, U. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2012. Lecture Notes in Computer Science, vol 7448. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32584-7_17

  • DOI: https://doi.org/10.1007/978-3-642-32584-7_17

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-32583-0

  • Online ISBN: 978-3-642-32584-7

  • eBook Packages: Computer Science, Computer Science (R0)