A Distributed Associative Classification Algorithm

Mokeddem, Djamila; Belbachir, Hafida

doi:10.1007/978-3-642-15211-5_12

Part of the book series: Studies in Computational Intelligence (SCI, volume 315)

Abstract

Associative classification algorithms have been successfully used to construct classification systems. Their major strength is that they can select the most accurate rules from an exhaustive list of class-association rules. This explains their generally good performance, which comes at a high computational cost inherited from association rule discovery algorithms. We address this issue by proposing a distributed methodology based on the FP-growth algorithm. In a shared-nothing architecture, subsets of classification rules are generated in parallel from several data partitions. Inter-processor communication is established in order to make global decisions. This exchange takes place only at the first level of recursion, allowing each machine to subsequently process all its assigned tasks independently. The final classifier is built by a majority vote. The approach is illustrated by a detailed example and an analysis of communication cost.
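The two coordination points named in the abstract, a single exchange to reach global decisions and a final majority vote, can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the function names, the toy data, the use of plain item counts as the exchanged information, and the vote over per-partition predictions are all assumptions made for illustration, and the FP-tree construction and rule generation steps are omitted entirely.

```python
from collections import Counter

def local_counts(partition):
    """Count item occurrences in one data partition (one node's local work)."""
    counts = Counter()
    for items, _label in partition:
        counts.update(set(items))
    return counts

def global_frequent_items(partitions, min_support):
    """Hypothetical one-time exchange: sum the local counts from every
    partition and keep the items whose global count reaches min_support."""
    total = Counter()
    for part in partitions:
        total += local_counts(part)
    return {item for item, count in total.items() if count >= min_support}

def majority_vote(predictions):
    """Return the most common class label among the given predictions.
    Ties are resolved in favour of the label seen first."""
    return Counter(predictions).most_common(1)[0][0]

# Toy data (assumed): each transaction is (set of items, class label).
partitions = [
    [({"a", "b"}, "yes"), ({"a", "c"}, "no")],
    [({"a", "b"}, "yes"), ({"b", "d"}, "yes")],
]

frequent = global_frequent_items(partitions, min_support=2)
winner = majority_vote(["yes", "no", "yes"])
```

In this sketch the nodes communicate only once, when the local counts are summed; everything after that could run on each partition without further messages, which mirrors the independence property the abstract describes.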

Author information

Authors and Affiliations

Dept. of Computer Sciences, Laboratory of Signal Systems and Data LSSD, University of Sciences and Technologies Mohamed Boudiaf Oran, B.P 1505, Elmnaouer Oran, Algeria
Djamila Mokeddem & Hafida Belbachir

Editor information

Editors and Affiliations

Abdelmalek Essaadi University, Tetuan, Morocco
Mohammad Essaaidi
Università di Catania, Italy
Michele Malgeri
University of Craiova, Romania
Costin Badica

About this paper

Cite this paper

Mokeddem, D., Belbachir, H. (2010). A Distributed Associative Classification Algorithm. In: Essaaidi, M., Malgeri, M., Badica, C. (eds) Intelligent Distributed Computing IV. Studies in Computational Intelligence, vol 315. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15211-5_12

DOI: https://doi.org/10.1007/978-3-642-15211-5_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15210-8
Online ISBN: 978-3-642-15211-5
eBook Packages: Engineering, Engineering (R0)