
Distributed and Shared Memory Algorithm for Parallel Mining of Association Rules

  • Conference paper
In: Machine Learning and Data Mining in Pattern Recognition (MLDM 2007)

Abstract

The search for frequent patterns in transactional databases is considered one of the most important data mining problems. Several parallel and sequential algorithms have been proposed in the literature to solve this problem. Almost all of these algorithms make repeated passes over the dataset to determine the set of frequent itemsets, which implies high I/O overhead. In the parallel case, most algorithms also perform a sum-reduction at the end of each pass to construct the global counts, which implies high synchronization cost. We present a novel algorithm that efficiently exploits the trade-offs between computation, communication, memory usage and synchronization. The algorithm was implemented on a cluster of SMP nodes, combining the distributed and shared memory paradigms. This paper presents the results of our algorithm for different data sizes and different numbers of processors, and studies the effect of these variations on overall performance.
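The conventional scheme that the abstract criticizes can be sketched as follows. This is a minimal illustration of level-wise counting with a sum-reduction after every pass, not the algorithm proposed in the paper. The function names (`local_counts`, `sum_reduce`, `mine_frequent`) and the in-memory list of partitions standing in for cluster nodes are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def local_counts(partition, candidates):
    """Count, within one partition, the transactions containing each candidate."""
    counts = Counter()
    for transaction in partition:
        items = set(transaction)
        for cand in candidates:
            if cand <= items:
                counts[cand] += 1
    return counts


def sum_reduce(per_node_counts):
    """Sum-reduction: add the local counts of every node into global counts."""
    total = Counter()
    for counts in per_node_counts:
        total.update(counts)
    return total


def mine_frequent(partitions, min_support):
    """Level-wise mining: one full pass over every partition per itemset size."""
    items = sorted({i for part in partitions for t in part for i in t})
    candidates = [frozenset([i]) for i in items]
    frequent = {}
    passes = 0
    while candidates:
        passes += 1
        # Each (simulated) node rescans its own partition: the repeated I/O.
        per_node = [local_counts(p, candidates) for p in partitions]
        # Synchronization point: every node must finish before the reduction.
        global_counts = sum_reduce(per_node)
        level = {c: n for c, n in global_counts.items() if n >= min_support}
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-itemset candidates, then prune
        # any candidate that has an infrequent k-subset.
        k = len(candidates[0]) + 1
        joined = {a | b for a, b in combinations(level, 2) if len(a | b) == k}
        candidates = [
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return frequent, passes


if __name__ == "__main__":
    # Two partitions stand in for two nodes; the data is made up.
    parts = [
        [["a", "b", "c"], ["a", "b"]],
        [["a", "c"], ["b", "c"], ["a", "b", "c"]],
    ]
    freq, passes = mine_frequent(parts, min_support=3)
    print(passes, sorted("".join(sorted(s)) for s in freq))
```

Each iteration of the `while` loop costs one scan of every partition and one global reduction, so the number of passes grows with the size of the largest candidate itemset. These are the I/O and synchronization costs the abstract refers to.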


Editor information

Petra Perner

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Palancar, J.H., Tormo, O.F., Cárdenas, J.F., León, R.H. (2007). Distributed and Shared Memory Algorithm for Parallel Mining of Association Rules. In: Perner, P. (eds) Machine Learning and Data Mining in Pattern Recognition. MLDM 2007. Lecture Notes in Computer Science, vol 4571. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73499-4_27

  • DOI: https://doi.org/10.1007/978-3-540-73499-4_27

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-73498-7

  • Online ISBN: 978-3-540-73499-4

  • eBook Packages: Computer Science, Computer Science (R0)
