
Task-Parallel FP-Growth on Cluster Computers

  • Conference paper

Part of the book series: Lecture Notes in Electrical Engineering ((LNEE,volume 62))

Abstract

Frequent itemset mining (FIM) is one of the most deeply studied data mining tasks. A number of algorithms, employing different approaches and advanced data structures, have already been proposed to solve the task efficiently. Even the fastest serial FIM algorithms fail to scale up with the rapid growth of database sizes. Hence, parallel FIM algorithms are the only viable solutions in many domains, as serial solutions have almost reached the physical barriers. To this end, parallel versions of a few serial FIM algorithms, including FP-Growth, have already been developed. In this study, we develop three different parallel FP-Growth implementations for cluster computers. All three are MPI-based: (i) Static Parallel FP-Growth, (ii) Dynamic Parallel FP-Growth, and (iii) (Tree-Sharing) Dynamic Parallel FP-Growth. All three variants are task-parallel, i.e., not based on horizontal or vertical partitioning of the database. The algorithms are experimentally evaluated on a 16-node cluster computer. Our results demonstrate the utility of the algorithms.
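
The abstract does not say how mining tasks are defined or handed out, so the following is only a minimal sketch of the general difference between static and dynamic task assignment, not the authors' implementation. It assumes each task has a known cost in arbitrary time units, uses no MPI, and all function names and cost values are hypothetical.

```python
import heapq


def static_makespan(costs, n_workers):
    """Static assignment: task i is fixed to worker i % n_workers up front."""
    loads = [0] * n_workers
    for i, cost in enumerate(costs):
        loads[i % n_workers] += cost
    # The run finishes when the most heavily loaded worker finishes.
    return max(loads)


def dynamic_makespan(costs, n_workers):
    """Dynamic assignment: the next task goes to the worker that is idle first."""
    free_at = [0] * n_workers  # min-heap of times at which each worker is idle
    heapq.heapify(free_at)
    for cost in costs:
        start = heapq.heappop(free_at)         # earliest idle worker
        heapq.heappush(free_at, start + cost)  # busy until start + cost
    return max(free_at)


# Hypothetical, deliberately skewed task costs (assumed, not from the paper).
costs = [9, 1, 1, 1, 8, 1, 1, 1]
print(static_makespan(costs, 4), dynamic_makespan(costs, 4))
```

With skewed costs like these, the fixed round-robin split places both expensive tasks on the same worker, while on-demand assignment spreads them out. This kind of imbalance is the usual motivation for dynamic schemes, at the price of extra coordination between workers.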

Supported by TUBITAK, grant number 108E016



Author information

Corresponding author

Correspondence to Gülistan Özdemir Özdogan.

Copyright information

© 2011 Springer Science+Business Media B.V.

About this paper

Cite this paper

Özdogan, G.Ö., Abul, O. (2011). Task-Parallel FP-Growth on Cluster Computers. In: Gelenbe, E., Lent, R., Sakellari, G., Sacan, A., Toroslu, H., Yazici, A. (eds) Computer and Information Sciences. Lecture Notes in Electrical Engineering, vol 62. Springer, Dordrecht. https://doi.org/10.1007/978-90-481-9794-1_71

  • DOI: https://doi.org/10.1007/978-90-481-9794-1_71

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-90-481-9793-4

  • Online ISBN: 978-90-481-9794-1

  • eBook Packages: Engineering, Engineering (R0)
