P2P RVM for Distributed Classification

Khan, Muhammad Umer; Nanopoulos, Alexandros; Schmidt-Thieme, Lars

doi:10.1007/978-3-662-44983-7_13

Part of the book series: Studies in Classification, Data Analysis, and Knowledge Organization (STUDIES CLASS)

Abstract

In recent years there has been increasing interest in analytical methods that learn patterns over large-scale data distributed across Peer-to-Peer (P2P) networks and support applications. Mining patterns in such a distributed and dynamic environment is a challenging task, because centralization of data is not feasible. In this paper, we propose a distributed classification technique based on relevance vector machines (RVM) and local model exchange among neighboring peers in a P2P network. In such networks, the evaluation criteria for an efficient distributed classification algorithm are the size of the resulting local models (communication efficiency) and their prediction accuracy. RVM utilizes dramatically fewer kernel functions than a state-of-the-art support vector machine (SVM), while demonstrating comparable generalization performance. This makes RVM a suitable choice for learning compact and accurate local models at each peer in a P2P network. Our model propagation approach exchanges the resulting models with peers in a local neighborhood to produce a more accurate network-wide global model, while keeping the communication cost low throughout the network. Through extensive experimental evaluations, we demonstrate that by using more relevant and compact models, our approach outperforms the baseline model propagation approaches in terms of accuracy and communication cost.
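
The trade-off between model size and accuracy described above can be illustrated with a minimal sketch. It is not the chapter's algorithm: the kernel weights are hand-picked rather than learned by RVM training, the rule for combining models (summing kernel decision values) is an assumption, and all names (`LocalModel`, `Peer`, `propagate`) are hypothetical. It only shows that exchanging a model costs as much as the number of basis vectors it contains, and that a peer can classify regions it has never seen once it holds a neighbor's model.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Vector = Tuple[float, ...]


def rbf_kernel(a: Vector, b: Vector, gamma: float = 1.0) -> float:
    """Gaussian (RBF) kernel between two points."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


@dataclass
class LocalModel:
    # (basis vector, weight) pairs. For an RVM these would be the relevance
    # vectors; here they are hand-picked for illustration, not learned.
    terms: List[Tuple[Vector, float]]

    def decision(self, x: Vector) -> float:
        return sum(w * rbf_kernel(v, x) for v, w in self.terms)

    def size(self) -> int:
        # Number of basis vectors, used as a proxy for communication cost.
        return len(self.terms)


@dataclass
class Peer:
    local: LocalModel
    received: List[LocalModel] = field(default_factory=list)

    def predict(self, x: Vector) -> int:
        # Assumed combination rule: sum the decision values of the peer's
        # own model and of every model received from a neighbor.
        score = self.local.decision(x) + sum(m.decision(x) for m in self.received)
        return 1 if score >= 0 else -1


def propagate(peers: Dict[str, Peer], neighbours: Dict[str, List[str]]) -> int:
    """One exchange round: every peer sends its local model to each direct
    neighbor. Returns the total number of basis vectors transmitted."""
    cost = 0
    for name, peer in peers.items():
        for other in neighbours[name]:
            peers[other].received.append(peer.local)
            cost += peer.local.size()
    return cost


# Two peers, each of which has only seen one class locally.
peers = {
    "A": Peer(LocalModel([((0.0, 0.0), 1.0)])),   # positive class near the origin
    "B": Peer(LocalModel([((3.0, 3.0), -1.0)])),  # negative class near (3, 3)
}
neighbours = {"A": ["B"], "B": ["A"]}

before = peers["A"].predict((3.0, 3.0))  # 1: A alone mislabels B's region
cost = propagate(peers, neighbours)      # 2: one basis vector sent each way
after = peers["A"].predict((3.0, 3.0))   # -1: corrected after the exchange
```

In this toy setting the cost of a round grows linearly with the number of basis vectors per model, which is why a sparser local learner lowers communication cost for the same network topology.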

Acknowledgements

This work is funded by the Seventh Framework Program of the European Commission through the project REDUCTION (No. 288254), www.reduction-project.eu.

Author information

Authors and Affiliations

Information Systems and Machine Learning Lab, University of Hildesheim, Hildesheim, Germany
Muhammad Umer Khan & Lars Schmidt-Thieme
University of Eichstätt, Ingolstadt, Germany
Alexandros Nanopoulos

Corresponding author

Correspondence to Muhammad Umer Khan.

Editor information

Editors and Affiliations

University of Essex, Colchester, United Kingdom
Berthold Lausen
University of Luxembourg, Walferdange, Luxembourg
Sabine Krolak-Schwerdt
University of Luxembourg, Walferdange, Luxembourg
Matthias Böhmer

About this paper

Cite this paper

Khan, M.U., Nanopoulos, A., Schmidt-Thieme, L. (2015). P2P RVM for Distributed Classification. In: Lausen, B., Krolak-Schwerdt, S., Böhmer, M. (eds) Data Science, Learning by Latent Structures, and Knowledge Discovery. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-44983-7_13

DOI: https://doi.org/10.1007/978-3-662-44983-7_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-44982-0
Online ISBN: 978-3-662-44983-7
eBook Packages: Mathematics and Statistics (R0)